Printed in Great Britain


A comparison of theoretical and empirical results for some
stochastic population models*


By M. S. BARTLETT, J. C. GOWER and P. H. LESLIE


Statistical Laboratory, University of Manchester; Statistical Department, Rothamsted Experimental
Station; Bureau of Animal Population, Department of Zoological Field Studies, Oxford


1. GENERAL REMARKS 


In recent papers Bartlett (1957), Leslie (1958) and Leslie & Gower (1958) have illustrated 
by means of artificial series the properties of various idealized models of biological systems, 
including the single-species logistic stochastic process and two-species extensions. While 
these artificial series are always useful in an auxiliary qualitative sense, the theoretical 
intractability of many of these models has given the artificial series a somewhat more 
important role than they might otherwise have had. Nevertheless, what theoretical results 
there are should not be neglected, and indeed some of these were compared with empirical 
results from series in the last two of the papers mentioned above. It is the purpose of the 
present paper to indicate somewhat more systematically where theoretical results, even 
when only approximate, may be useful, and to make some further comparisons with the 
empirical results available. 


PART I. THEORETICAL RESULTS 
2. SINGLE-SPECIES MODELS (CONTINUOUS TIME) 


Consider first a stochastic population model for a single species, with transition
probabilities (in continuous time during the infinitesimal interval dt) $\lambda_n\,dt$ of a ‘birth’, and
$\mu_n\,dt$ of a ‘death’, where n is the total population size. A ‘death’ may include emigration, but
unless a ‘birth’ can include immigration, $\lambda_0 = 0$. If $\lambda_0 = 0$, an ultimate stationary
distribution for n cannot strictly exist, but may effectively exist over all realizable
time-intervals (see §5; also Leslie (1958), Bartlett (1960)). Under conditions for which a
stationary (or quasi-stationary) distribution does exist, the probability distribution for it
must satisfy the recurrence relation

$$\mu_n P(n) = \lambda_{n-1} P(n-1) \tag{1}$$

(see, for example, Bartlett, 1960), from which relation the exact distribution P(n) may
always be calculated numerically, as will be illustrated below (§6). Under some further
conditions which include $m,\ m/\sigma \gg 1$, we have asymptotically

$$P(n) \sim C \exp\{-\tfrac{1}{2}(n-m)^2/\sigma^2\}, \tag{2}$$
where m is the relevant solution of $\lambda_m = \mu_m$; and
$$\sigma^2 = -1 \Big/ \left[\frac{d \log(\lambda_n/\mu_n)}{dn}\right]_{n=m}. \tag{3}$$


* The work of one of the authors (M.S.B.) was supported in part by a research contract between 
the Office of Naval Research and the Department of Statistics, Harvard University. 
If alternatively the properties of small fluctuations about the mean of the stationary
distribution are investigated directly, the results obtained to the first approximation will 
be equivalent to the normal approximation above. As the first mention of this approach 
(Bartlett, 1956) was rather brief, it seems worth while showing how it may be developed to 
include second-stage (or even higher order) corrections. The procedure is sufficiently illu- 
strated by means of the logistic model 

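$$\lambda_n = a_1 n - b_1 n^2, \qquad \mu_n = a_2 n + b_2 n^2, \tag{4}$$
where $\lambda_n$ remains zero when $n > a_1/b_1$ $(a_1, b_1, a_2, b_2 \ge 0)$. The stochastic equation (cf.
Bartlett, 1957) is

$$dN_t = (\lambda_N - \mu_N)\,dt + dZ_1 - dZ_2, \tag{5}$$
or
$$N_{t+dt} = N_t + (\lambda_N - \mu_N)\,dt + dZ_1 - dZ_2. \tag{6}$$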


Averaging (6) on the assumption that a stationary distribution has been reached, we have 
(from the coefficient of dt) 


$$(a_1 - a_2)\,m - (b_1 + b_2)(\sigma^2 + m^2) = 0, \tag{7}$$
where $m = E\{N\}$, $\sigma^2 = E\{(N-m)^2\}$. Write further $\delta N_t = N_t - m$; then
$$\delta N_{t+dt} = \delta N_t + (\lambda_N - \mu_N)\,dt + dZ_1 - dZ_2. \tag{8}$$


Squaring and averaging this equation, we have exactly 
$$2[(a_1 - a_2)\sigma^2 - (b_1 + b_2)(2m\sigma^2 + \mu_3)] + [(a_1 + a_2)m - (b_1 - b_2)(m^2 + \sigma^2)] = 0,$$
where $\mu_3 = E\{(\delta N)^3\}$. Hence to the first order of approximation (noting that
$$m \sim (a_1 - a_2)/(b_1 + b_2), \qquad \mu_3 \sim 0$$
to this order)
$$\sigma^2 \sim (a_1 - b_1 m)/(b_1 + b_2). \tag{9}$$


Similarly from the cube of (8) we obtain, noting that the averaged cube of $dZ_1 - dZ_2$ is
strictly zero at N = m,

$$3E\{(\lambda_N - \mu_N)\,dt\,(\delta N)^2\} + 3E\{(dZ_1 - dZ_2)^2\,\delta N\} = 0.$$

From the normality approximation, we can to the second order of approximation write in
this equation $\mu_4 = E\{(\delta N)^4\} = 3\sigma^4$. We thus obtain

$$(a_1 - a_2)(\mu_3 + m\sigma^2) - (b_1 + b_2)(m^2\sigma^2 + 2m\mu_3 + 3\sigma^4) = -(a_1 + a_2)\sigma^2 + (b_1 - b_2)(2m\sigma^2 + \mu_3),$$
whence to the same order, as $m \gg 1$,
$$\mu_3\, m\,(b_1 + b_2) \sim \sigma^2 m\,(b_2 - b_1),$$
that is
$$\mu_3 \sim \sigma^2 (b_2 - b_1)/(b_2 + b_1). \tag{10}$$
It will be seen that $\mu_3$ is only zero to this order if $b_1 = b_2$; and it changes sign as we move
from a constant birth-rate ($b_1 = 0$) to a constant death-rate ($b_2 = 0$).
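
The algebra leading from the squared form of (8) to the first-order variance (9) may be sketched as follows. This is only an illustrative working, assuming the first-order values $m \approx (a_1 - a_2)/(b_1 + b_2)$ and $\mu_3 \approx 0$, and neglecting $\sigma^2$ beside $m^2$ in the term arising from $E\{(dZ_1 - dZ_2)^2\}$.

```latex
\begin{align*}
E\{\delta N(\lambda_N-\mu_N)\}
  &= (a_1-a_2)\sigma^2-(b_1+b_2)\,E\{N^2\,\delta N\}
   = (a_1-a_2)\sigma^2-(b_1+b_2)(2m\sigma^2+\mu_3) \\
  &\approx -(b_1+b_2)\,m\,\sigma^2 , \\
E\{\lambda_N+\mu_N\}
  &\approx (a_1+a_2)m-(b_1-b_2)m^2 = 2m\,(a_1-b_1 m), \\
0 &= 2E\{\delta N(\lambda_N-\mu_N)\}+E\{\lambda_N+\mu_N\}
  \approx -2(b_1+b_2)m\sigma^2+2m(a_1-b_1 m)
  \;\Longrightarrow\;
  \sigma^2 \approx \frac{a_1-b_1 m}{b_1+b_2}.
\end{align*}
```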


3. SINGLE-SPECIES MODELS (DISCRETE TIME) 


Before, however, we consider comparing any of these results with empirical results 
obtained from Leslie’s artificial series, we must recall that the latter were obtained on the 
basis of a discrete-time model (Leslie, 1958), which has its own theoretical distribution. 
Whilst its exact form would be complicated, the investigation of approximative moment 
formulae proceeds very similarly, as was noted by Leslie & Gower (1958) in the case of a
two-species model. We shall illustrate the procedure here for a single-species model, taking 
it to the next stage of approximation. 

The transitions in Leslie’s model are obtained from the recurrence formulae 


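$$E\{N_{t+1}\} = e^{b_t - d_t} N_t, \qquad \sigma^2\{N_{t+1}\} = \frac{b_t + d_t}{b_t - d_t}\,(e^{b_t - d_t} - 1)\,e^{b_t - d_t} N_t, \tag{11}$$
treating $b_t$ and $d_t$ constant from t to t + 1; and assuming also $\Delta N_t$ normal (with the restriction
$N_{t+1} \ge 0$). Putting $b_t - d_t = \log \lambda_t$, we have in the recurrence relation
$$N_{t+1} = f(N_t) + Z_{t+1}, \tag{12}$$
$f(N_t) = \lambda_t N_t$ in this case, where $\lambda_t$ for the logistic model is of the form $\lambda/(1 + \alpha N_t)$. Putting
$\delta N_t = N_t - m$, where $m = E\{N\}$ under stationary conditions, and writing

$$f(N_t) = f(m) + \delta N_t\,\frac{\partial f}{\partial m} + \tfrac{1}{2}(\delta N_t)^2\,\frac{\partial^2 f}{\partial m^2} + \dots,$$

where $\partial f/\partial m$ denotes $\partial f/\partial N_t$ at the value $N_t = m$, etc., we have in the first approximation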


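$$m = f(m) = (\lambda - 1)/\alpha, \tag{13}$$
$$\sigma^2 = \frac{\sigma^2(Z)}{1 - (\partial f/\partial m)^2} = \frac{\sigma^2(Z)}{1 - (1/\lambda)^2}, \qquad \mu_3 = 0. \tag{14}$$
To the next approximation
$$f(m) - m + \tfrac{1}{2}\sigma^2\,\frac{\partial^2 f}{\partial m^2} = 0,$$

where $\partial f/\partial m = 1/\lambda$, $\partial^2 f/\partial m^2 = -2\alpha/\lambda^2$; and
$$\mu_3 = E\left\{\left(\delta N\,\frac{\partial f}{\partial m} + \tfrac{1}{2}(\delta N^2 - \sigma^2)\frac{\partial^2 f}{\partial m^2} + Z_{t+1}\right)^3\right\} \sim \mu_3(Z) + \left(\frac{\partial f}{\partial m}\right)^3 \mu_3 + \tfrac{3}{2}\left(\frac{\partial f}{\partial m}\right)^2 \frac{\partial^2 f}{\partial m^2}\,(\mu_4 - \sigma^4) + 3E\left\{\sigma^2(Z_{t+1} \mid N_t)\left(\delta N\,\frac{\partial f}{\partial m} + \tfrac{1}{2}(\delta N^2 - \sigma^2)\frac{\partial^2 f}{\partial m^2}\right)\right\}.$$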
Now $\mu_3(Z) = 0$, $\mu_4 \sim 3\sigma^4$, and it only remains to evaluate such terms as $E\{\sigma^2(Z_{t+1} \mid N_t)\,\delta N\}$.
It should be noted that the value of this expression may depend on the precise numerical
procedure adopted in obtaining the artificial series. Thus if $\sigma^2(Z_{t+1} \mid N_t)$ were taken constant,
say at the value $\sigma^2(Z_{t+1} \mid m)$, the whole of the last term in the expression above for $\mu_3$ would
be zero. However, if, more accurately, we expand $\sigma^2(Z_{t+1} \mid N_t)$ in (11) in the neighbourhood
of $N_t = m$ (where $b_t \sim d_t$), we find

$$\sigma^2(Z_{t+1} \mid N_t) \sim (2b - (3b - 1)\,\alpha\,\delta N/\lambda)\,N_t \tag{15}$$
when the birth-rate is constant (b), and
$$\sigma^2(Z_{t+1} \mid N_t) \sim (2d - (3d + 1)\,\alpha\,\delta N/\lambda)\,N_t \tag{16}$$

when the death-rate is constant (d). These results, incidentally, may be useful as
approximations for $\sigma^2(Z_{t+1})$ when artificial series are being constructed. Thus, retaining terms of
the appropriate order, we find for $b_t = b$,

$$\mu_3\,(1 - (1/\lambda)^3) \sim \frac{3\sigma^2(\lambda - 1)}{\lambda^2}\left\{\frac{2b(\lambda + 2)}{\lambda + 1} - (3b - 1)\right\}. \tag{17}$$


Similarly in the case of constant death-rate (d),

$$\mu_3\,(1 - (1/\lambda)^3) \sim \frac{3\sigma^2(\lambda - 1)}{\lambda^2}\left\{\frac{2d(\lambda + 2)}{\lambda + 1} - (3d + 1)\right\}. \tag{18}$$


It may be useful to summarize the formulae derived in §§ 2 and 3. 


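Logistic model; continuous time
$\lambda_n = a_1 n - b_1 n^2$; $\mu_n = a_2 n + b_2 n^2$.
Variance (1st approx.) $\sigma^2 \sim (a_1 - b_1 m)/(b_1 + b_2)$.
Mean (2nd approx.) $m' \sim m - \sigma^2/m$, where $m = (a_1 - a_2)/(b_1 + b_2)$.
Skewness (2nd approx.) $\mu_3 \sim \sigma^2 (b_2 - b_1)/(b_2 + b_1)$.


Discrete time
$\lambda_t = \lambda/(1 + \alpha N_t)$, (i) birth-rate constant, b; (ii) death-rate constant, d.
(i) Variance (1st approx.) $\sigma^2 \sim 2bm/\{1 - (1/\lambda)^2\}$, where $m = (\lambda - 1)/\alpha$.
Mean (2nd approx.) $m' \sim m - \sigma^2/(\lambda m)$.
Skewness (2nd approx.)

$$\mu_3 \sim \frac{3\sigma^2 \lambda(\lambda - 1)}{\lambda^3 - 1}\left\{\frac{2b(\lambda + 2)}{\lambda + 1} - (3b - 1)\right\}.$$

(ii) Variance (1st approx.) $\sigma^2 \sim 2dm/\{1 - (1/\lambda)^2\}$.
Mean (2nd approx.) $m' \sim m - \sigma^2/(\lambda m)$.
Skewness (2nd approx.)

$$\mu_3 \sim \frac{3\sigma^2 \lambda(\lambda - 1)}{\lambda^3 - 1}\left\{\frac{2d(\lambda + 2)}{\lambda + 1} - (3d + 1)\right\}.$$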
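
The summary formulae lend themselves to direct evaluation. The following Python sketch is offered only as an illustration; the function names and the `constant` argument are assumptions of the sketch and not part of the original work. It returns the second-approximation mean, the first-approximation variance and the second-approximation third moment for either model.

```python
import math


def continuous_logistic_moments(a1, b1, a2, b2):
    """Approximate (mean, variance, mu3) for the continuous-time logistic model."""
    m = (a1 - a2) / (b1 + b2)              # first-approximation mean
    var = (a1 - b1 * m) / (b1 + b2)        # variance, 1st approx.
    mean2 = m - var / m                    # mean, 2nd approx.
    mu3 = var * (b2 - b1) / (b2 + b1)      # third moment, 2nd approx.
    return mean2, var, mu3


def discrete_logistic_moments(lam, alpha, rate, constant="birth"):
    """Approximate (mean, variance, mu3) for the discrete-time logistic model.

    `rate` is the constant birth-rate b (constant="birth") or the constant
    death-rate d (constant="death"); this calling convention is hypothetical.
    """
    m = (lam - 1.0) / alpha
    var = 2.0 * rate * m / (1.0 - (1.0 / lam) ** 2)
    mean2 = m - var / (lam * m)
    offset = 3.0 * rate - 1.0 if constant == "birth" else 3.0 * rate + 1.0
    bracket = 2.0 * rate * (lam + 2.0) / (lam + 1.0) - offset
    mu3 = 3.0 * var * lam * (lam - 1.0) / (lam ** 3 - 1.0) * bracket
    return mean2, var, mu3


def gamma1(var, mu3):
    """Skewness coefficient mu3 / sigma^3."""
    return mu3 / var ** 1.5
```

For instance, `discrete_logistic_moments(2.0, 0.01, 0.1145, "death")` gives a mean near 99.85 and a standard deviation near 5.53, and `continuous_logistic_moments(0.8077, 0.006932, 0.1145, 0.0)` gives a variance near 16.52.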





4. MODELS WITH TWO SPECIES 


Whilst the above methods are available for models with two (or more) species, the 
formulae get rather complicated; as the first approximation results in the case of a quasi- 
stationary distribution for two species have already been indicated by Leslie & Gower 
(1958), they will not be listed here. With regard to the exact recurrence relation (1) for the 
distribution P(n), the corresponding relation for two species is easy to write down, but has 
no simple method of solution. However, under conditions for which a well-defined stationary 
(or quasi-stationary) distribution exists with $m_i,\ m_i/\sigma_i \gg 1$, the distribution will be
approximately bivariate normal, with its moments given by the approximate formulae already
referred to. 

In the case of models for which one species will become extinct, it is also easy to write 
down the equation satisfied by the extinction probability, say, p(n, n’) for the first species, 
if $n$ and $n'$ are the initial numbers of the two species. Thus consider a continuous time model
with birth- and death-rates:


1st species                                    2nd species
$\lambda(n, n') = a_1 - b_1 n - c_1 n'$        $\lambda'(n, n') = a_1' - b_1' n' - c_1' n$
$\mu(n, n') = a_2 + b_2 n + c_2 n'$            $\mu'(n, n') = a_2' + b_2' n' + c_2' n$

The equation for $p(n, n')$ is obtained from the differential equation for the chance of
extinction $p_t(n, n')$ by time t, and letting $t \to \infty$; this equation for $p_t(n, n')$ is readily derived from
the possible transitions in the first infinitesimal interval dt. We obtain

$$n\lambda(n, n')[p(n+1, n') - p(n, n')] + n\mu(n, n')[p(n-1, n') - p(n, n')] + n'\lambda'(n, n')[p(n, n'+1) - p(n, n')] + n'\mu'(n, n')[p(n, n'-1) - p(n, n')] = 0, \tag{19}$$


with boundary conditions p(n,0) = 0, p(0, n’) = 1. While it is possible to solve this equation, 
for example, by iteration, in any actual example, it seems quicker to obtain approximate 
answers by Monte Carlo methods, as shown by Leslie & Gower (1958). 
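
The iterative solution of (19) mentioned above may be sketched as below. This is a minimal illustration, not the authors' procedure: it assumes a Gauss-Seidel sweep, a truncated grid on which births are suppressed at the hypothetical upper limit `n_max`, birth-rates clamped at zero, and rate tuples ordered as $(a_1, b_1, c_1, a_2, b_2, c_2)$ for each species.

```python
def extinction_probabilities(species1, species2, n_max, tol=1e-10, max_sweeps=20000):
    """Iterate (19) for p(n, n'), the chance that the first species dies out first."""
    a1, b1, c1, a2, b2, c2 = species1      # lambda = a1 - b1 n - c1 n', mu = a2 + b2 n + c2 n'
    A1, B1, C1, A2, B2, C2 = species2      # primed rates of the second species
    size = n_max + 2                       # spare row and column so that n + 1 is a valid index
    p = [[0.5] * size for _ in range(size)]
    for k in range(size):
        p[k][0] = 0.0                      # boundary condition p(n, 0) = 0
    for k in range(1, size):
        p[0][k] = 1.0                      # boundary condition p(0, n') = 1
    for _ in range(max_sweeps):
        change = 0.0
        for n in range(1, n_max + 1):
            for m in range(1, n_max + 1):
                # total transition rates out of the state (n, m); births stop at n_max
                up1 = n * max(0.0, a1 - b1 * n - c1 * m) if n < n_max else 0.0
                down1 = n * (a2 + b2 * n + c2 * m)
                up2 = m * max(0.0, A1 - B1 * m - C1 * n) if m < n_max else 0.0
                down2 = m * (A2 + B2 * m + C2 * n)
                new = (up1 * p[n + 1][m] + down1 * p[n - 1][m]
                       + up2 * p[n][m + 1] + down2 * p[n][m - 1]) / (up1 + down1 + up2 + down2)
                change = max(change, abs(new - p[n][m]))
                p[n][m] = new
        if change < tol:                   # stop once a full sweep alters nothing appreciably
            break
    return p
```

With identical (purely illustrative) rates for the two species, symmetry requires p(n, n) = 1/2, which gives a simple check on the iteration.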


5. RECURRENCE TIMES 


If it is desired to check recurrence times to any state S in a stationary process with a 
finite number of states, the relevant formulae have been given by Bartlett (1955, § 6.41). 
Thus in the case of discrete time, the mean recurrence time is 


$$\Theta_S = \frac{1 - P(S)}{P(S)[1 - P(S \mid S)]}, \tag{20}$$

where P(S | S) denotes the conditional probability of S at one instant, given S at the previous
instant. In the case of continuous time, if P(S | S) for times separated by an interval dt is
$1 - \epsilon\,dt + o(dt)$, then
$$\Theta_S = \frac{1 - P(S)}{\epsilon P(S)}. \tag{21}$$

This formula is relevant in assessing the passage-time to the zero state S for population
processes with an ‘absorbing barrier’ at this state, for if we insert a fictitious escape
probability $\epsilon\,dt$ in dt, we have from the equilibrium for the state N = 1,

$$\mu_1 P(1) = \epsilon P(0), \tag{22}$$
whence
$$\Theta_0 = \frac{1 - P(0)}{\mu_1 P(1)}. \tag{23}$$

Under conditions for which P(0) is small, the quasi-stationary distribution P(n), (n > 0),
exists approximately independently of $\epsilon$, and under such conditions the mean recurrence
time $\Theta_0 \sim 1/[\mu_1 P(1)]$ gives the order of magnitude of the passage-time to zero. For processes
of the type discussed in § 2 such passage-times may be so large as to be considered infinite
(cf. Leslie, 1958).

PART II. NUMERICAL RESULTS 

6. EXAMPLE OF DISTRIBUTION P(n) 

It was noted in § 2 that the recurrence relation for P(n) enabled P(n) to be calculated 
exactly. This is strictly true only if the zero state is not absorbing, but from the last section 
P(n) is effectively defined in quasi-stationary cases also. Whilst it is still a purely theoretical 
result, we give a numerical example of P(n) in this last case, for the logistic model 


$a_1 = 0.8077$, $b_1 = 0.006932$,
$a_2 = 0.1145$, $b_2 = 0$.


P(n), standardized to a total of 4975, was found to have the values given in Table 1. 
Table 1 (giving f = 4975 P(n))


  n       f         n       f         n       f
 81      0.1       93    118.3      105    236.7
 82      0.2       94    166.7      106    163.4
 83      0.5       95    224.7      107    103.1
 84      1.0       96    289.7      108     58.8
 85      1.9       97    356.0      109     30.1
 86      3.7       98    416.3      110     13.5
 87      6.7       99    461.8      111      5.3
 88     11.8      100    484.8      112      1.7
 89     20.2      101    480.1      113      0.5
 90     33.2      102    446.8      114      0.1
 91     52.7      103    388.9
 92     80.6      104    315.1      Total 4975.0








The constants of this distribution as calculated by the approximate formulae of § 2 agree 
very well with the exact results: 


              Exact      Approximation
$m$           99.83      99.83 (2nd approx.)
$\sigma^2$    16.71      16.52 (1st approx.)
$\mu_3$      -16.61     -16.52 (2nd approx.)
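
The exact calculation from the recurrence relation (1) can be illustrated by a short Python sketch. The function names are assumptions of the sketch; it takes P(1) as an arbitrary starting weight, multiplies up by the ratios $\lambda_{n-1}/\mu_n$ until the birth-rate vanishes, and then normalizes over n > 0, which corresponds to the quasi-stationary case.

```python
def quasi_stationary_logistic(a1, b1, a2, b2):
    """Return {n: P(n)} for n >= 1 from mu_n P(n) = lambda_{n-1} P(n-1)."""
    def birth(n):
        return max(0.0, a1 * n - b1 * n * n)   # lambda_n, held at zero once n > a1/b1

    def death(n):
        return a2 * n + b2 * n * n             # mu_n

    weights = [1.0]                            # unnormalized P(1)
    n = 1
    while birth(n) > 0.0:
        weights.append(weights[-1] * birth(n) / death(n + 1))
        n += 1
    total = sum(weights)
    return {k + 1: w / total for k, w in enumerate(weights)}


def central_moments(dist):
    """Mean, variance and third central moment of a distribution {n: P(n)}."""
    mean = sum(n * p for n, p in dist.items())
    var = sum((n - mean) ** 2 * p for n, p in dist.items())
    mu3 = sum((n - mean) ** 3 * p for n, p in dist.items())
    return mean, var, mu3
```

Calling `quasi_stationary_logistic(0.8077, 0.006932, 0.1145, 0.0)` and scaling by 4975 should reproduce the entries of Table 1 to within rounding.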


7. COMPARISON OF APPROXIMATE MOMENTS FOR DISCRETE-TIME 
MODEL WITH EMPIRICAL RESULTS 


Four empirical distributions for a logistic process in the region of the stationary state 
were built up on the Elliott-N.R.D.C. 401 computer at Rothamsted Experimental Station,
taking the values of the parameters in the discrete-time model

$$E\{N_{t+1} \mid N_t\} = \frac{\lambda N_t}{1 + \alpha N_t} = \lambda_t N_t \qquad (\lambda_t = e^{r_t}),$$

given in Table 2. In each case $m = (\lambda - 1)/\alpha = 100$, and it will be noted that the unit of time
in Ib is 1/5th of that adopted in the remaining three models.


Table 2
                                      Constant          Constant
Model    $\lambda$    $\alpha$        birth-rate (b)    death-rate (d)
Ia       2.0          0.01            1.0083            -
Ib       1.1487       0.001487        0.2017            -
IIa      2.0          0.01            -                 0.1145
IIb      2.0          0.01            -                 0.3151


The programmes used for computing Ia, IIa and IIb were originally written for a system
of two competing species (Leslie & Gower, 1958), but by putting two of the parameters
equal to zero (cf. the equations given later in § 8), these could be used for computing
simultaneously a pair of logistic processes. In order to simplify these programmes, however,
certain approximations had been made to var $(N_{t+1} \mid N_t)$. Thus, if the expression for the
variance in (11) is written as
$$\sigma^2\{N_{t+1}\} = \phi\,E\{N_{t+1}\},$$

where, when the birth-rate (b) remains constant (BRC model),

$$\phi = \left(\frac{2b}{r_t} - 1\right)(\lambda_t - 1), \tag{24}$$

and when the death-rate (d) remains constant (DRC model),

$$\phi = \left(\frac{2d}{r_t} + 1\right)(\lambda_t - 1), \tag{25}$$

then it may be shown empirically (Leslie, 1958), by tabulating $\phi$ over a relatively wide
range of possible values of $\lambda_t$ (or $\log_e \lambda_t = r_t$), that in the BRC model we have, when $\lambda = 2.0$
and constant (b) = 1.0083,
$$\sigma^2\{N_{t+1}\} \sim 2E\{N_{t+1}\}, \tag{26}$$
and in the DRC model for

constant (d) = 0.1145: $\sigma^2\{N_{t+1}\} \sim (-0.87 + 1.10\lambda_t)\,E\{N_{t+1}\}$,    (27)

constant (d) = 0.3151: $\sigma^2\{N_{t+1}\} \sim (-0.66 + 1.29\lambda_t)\,E\{N_{t+1}\}$.
These empirical approximations, which should hold over the entire development of any 
process with the given parameters, are closely related to (15) and (16) for systems in the 


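neighbourhood of the stationary mean. For, expressing the latter in terms of $E\{N_{t+1} \mid N_t\}$,
we have in the region of $N_t \sim m$,

$$N_t \sim \left(1 + \frac{\alpha\,\delta N}{\lambda}\right) E\{N_{t+1}\}, \qquad \lambda_t \sim 1 - \frac{\alpha\,\delta N}{\lambda},$$

and (15) and (16) can be written, respectively, as
$$\sigma^2\{Z_{t+1} \mid N_t\} \sim [(b + 1) + (b - 1)\lambda_t]\,E\{N_{t+1}\}, \tag{28}$$
$$\sigma^2\{Z_{t+1} \mid N_t\} \sim [(d - 1) + (d + 1)\lambda_t]\,E\{N_{t+1}\}. \tag{29}$$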


Thus, in the BRC model, when $b \sim 1$,
$$\sigma^2\{Z_{t+1} \mid N_t\} \sim 2E\{N_{t+1}\};$$
while in the DRC model, for d = 0.1145,
$$\sigma^2\{Z_{t+1} \mid N_t\} \sim [-0.8855 + 1.1145\lambda_t]\,E\{N_{t+1}\},$$
and for d = 0.3151
$$\sigma^2\{Z_{t+1} \mid N_t\} \sim [-0.6849 + 1.3151\lambda_t]\,E\{N_{t+1}\},$$

which correspond very closely to (27).
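
The closeness of the linear forms (28) and (29) to the ‘exact’ factors (24) and (25) can be checked numerically. The sketch below is illustrative only; the function names are hypothetical, and the exact factor is undefined at $\lambda_t = 1$ (where $r_t = 0$), so the comparison should be made slightly away from that point.

```python
import math


def phi_exact(rate, lam_t, constant="birth"):
    """Variance factor phi of (24) (constant birth-rate) or (25) (constant death-rate)."""
    r = math.log(lam_t)                    # r_t = log lambda_t; must be non-zero
    if constant == "birth":
        return (2.0 * rate / r - 1.0) * (lam_t - 1.0)
    return (2.0 * rate / r + 1.0) * (lam_t - 1.0)


def phi_linear(rate, lam_t, constant="birth"):
    """Linear approximation (28) or (29), intended for lambda_t near unity."""
    if constant == "birth":
        return (rate + 1.0) + (rate - 1.0) * lam_t
    return (rate - 1.0) + (rate + 1.0) * lam_t
```

For d = 0.1145 and $\lambda_t = 1.05$, for example, the two expressions differ by less than one part in a thousand.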

In the remaining model (Ib), a programme was used in which the ‘exact’ expression (24)
for $\phi$ was incorporated. This has the advantage that the choice of the constant birth-rate
(b) is not restricted to only a very limited range of values, as in the case of the
approximation (26).

The observed moments of these computed distributions, together with those expected 
from the theoretical approximations for the discrete-time (D-T) and equivalent continuous- 
time (C-T) models, are shown in Table 3. 

The agreement between the observed moments and the theoretical approximations for
the discrete-time model seems very satisfactory, considering the errors involved in estimating
the former, even in samples of this size. The standard errors quoted are in general classical
values, ignoring the serial correlations between the successive observations, and
representing lower limits to the correct values. The correlation $\rho_1$ between successive
observations is to the first approximation $1/\lambda$, and the correcting factors to the standard errors of
the mean and standard deviation are respectively $\sqrt{(1 + \rho_1)/(1 - \rho_1)}$ and $\sqrt{(1 + \rho_1^2)/(1 - \rho_1^2)}$.
Where the differences between the observed and theoretical means or standard deviations
exceed twice the classical standard error, the corrected standard error is shown in brackets,
and it will be seen that the one such difference no longer appears anomalous. The corrections
to the standard errors of moments to allow for dependence become more complicated for
the higher moments, but are not needed for the $\gamma_1$ values in Table 3 (formulae for them may
be ascertained if required from Chanda (1958)).
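
The correcting factors just described are simple to evaluate. The following Python sketch (the function name is an assumption of the sketch) takes $\rho_1 = 1/\lambda$ as its first approximation and returns the two factors.

```python
import math


def standard_error_factors(lam):
    """Correcting factors for the standard errors of the mean and of the standard deviation."""
    rho = 1.0 / lam                                    # first approximation to rho_1
    mean_factor = math.sqrt((1.0 + rho) / (1.0 - rho))
    sd_factor = math.sqrt((1.0 + rho ** 2) / (1.0 - rho ** 2))
    return mean_factor, sd_factor
```

For model Ib ($\lambda = 1.1487$) the second factor is about 2.7, so that a classical standard error of 0.12 for the standard deviation becomes roughly 0.32, in line with the bracketed value of 0.33 in Table 3.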
Table 3
                                          Approximations
         No. of                                                     Observed
Model    observations    Moment        (C-T) model    (D-T) model   (D-T) model
Ia       9950            Mean          98.54          98.66         98.60 ± 0.16
                         $\sigma$      12.06          16.40         16.47 ± 0.12
                         $\gamma_1$    0.083          0.035         0.072 ± 0.025
Ib       5240            Mean          98.54          98.55         98.63 ± 0.17
                         $\sigma$      12.06          12.90         12.53 ± 0.12 (0.33)
                         $\gamma_1$    0.083          0.076         0.053 ± 0.034
IIa      4975            Mean          99.83          99.85         99.97 ± 0.08
                         $\sigma$      4.06           5.53          5.51 ± 0.06
                         $\gamma_1$    -0.246         -0.161        -0.182 ± 0.035
IIb      4975            Mean          99.55          99.58         99.44 ± 0.13
                         $\sigma$      6.74           9.17          9.16 ± 0.09
                         $\gamma_1$    -0.148         -0.103        -0.103 ± 0.035
Table 4
  N    f(N)      N    f(N)      N    f(N)
 77      1      92    138     107    188
 78      0      93    166     108    120
 79      0      94    194     109     90
 80      1      95    213     110     64
 81      2      96    270     111     54
 82      5      97    285     112     25
 83      3      98    310     113     17
 84      4      99    329     114     14
 85      9     100    350     115      6
 86     20     101    361     116      0
 87     33     102    357     117      0
 88     35     103    339     118      1
 89     51     104    318
 90     67     105    261
 91     97     106    177     Total 4975
As an illustration, we give in Table 4 the observed frequency distribution for model IIa,
corresponding to the continuous-time model for which the exact distribution is given in § 6.
The main difference between this observed distribution and the exact form for the
equivalent continuous-time model is in the scale of the variance $\sigma^2$ (one notes in passing
that in both cases $\mu_3 \sim -\sigma^2$). It is, however, always possible to make the variances of the
two types of model more in agreement by adopting a smaller unit of time in the
discrete-time model, as is illustrated in the cases of Ia and Ib in the above table.
It may also be of interest to consider the recurrence times which were observed in this
set of realizations for model IIa. Regarding the occurrence of a particular integer as a
specified state S being occupied, the mean life-time spent in the state S, $T_S = 1/\{1 - P(S \mid S)\}$,
could be determined from the typed lists of results, and hence the mean recurrence time
for the state S,
$$\Theta_S = T_S\,\frac{1 - P(S)}{P(S)}.$$
The observed values of $T_S$ and $\Theta_S$, neglecting the tails of the distribution where the observed
frequencies became small, are given in Table 5.



Table 5

                   Observed          Normal                       Observed          Normal
State                                approximation    State                         approximation
  S      $T_S$     $\Theta_S$        $\Theta_S$         S      $T_S$     $\Theta_S$        $\Theta_S$

 85     1.1250      623.0             551.7            100    1.0870       14.3              13.8
 86     1.0000      247.7             343.2            101    1.0841       13.9              14.1
 87     1.0312      154.5             220.5            102    1.0593       13.7              14.8
 88     1.0294      145.2             146.4            103    1.1042       15.1              16.1
 89     1.0851      104.7             100.5            104    1.1042       16.1              18.0
 90     1.0635       78.0              71.2            105    1.0830       19.6              20.9
 91     1.0211       51.4              52.2            106    1.0727       29.1              25.0
 92     1.0455       36.7              39.5            107    1.1124       28.4              30.9
 93     1.0573       30.7              30.9            108    1.0526       42.6              39.5
 94     1.0838       26.7              25.0            109    1.0465       56.8              52.2
 95     1.1036       24.7              20.9            110    1.0159       77.9              71.2
 96     1.0887       18.9              18.0            111    1.0000       91.1             100.5
 97     1.0634       17.5              16.1            112    1.0870      215.2             146.4
 98     1.0954       16.4              14.8            113    1.0625      309.8             220.5
 99     1.0717       15.1              14.1            114    1.0000      354.8             343.2


If the conditional probability of S, given S at the preceding instant, P(S |S) ~ P(S), as 
appears very roughly to be the case in this example (e.g. for the ten states disposed sym- 
metrically about the mean, the average P(S | S) = 0-0785 and the average P(S) = 0-0630), 
then $\Theta_S \sim 1/P(S)$.
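
The observed recurrence times in Table 5 follow from the mean life-times and the frequencies of Table 4. A minimal Python sketch of that arithmetic is given below; the function name and argument names are assumptions of the sketch, with P(S) estimated by the observed relative frequency.

```python
def observed_recurrence_time(mean_lifetime, frequency, total):
    """Mean recurrence time T_S (1 - P(S)) / P(S), with P(S) = frequency / total."""
    p = frequency / total
    return mean_lifetime * (1.0 - p) / p
```

For the state S = 100, with a mean life-time of 1.0870 and a frequency of 350 in 4975, this gives about 14.4, which agrees with the tabulated 14.3 to within the rounding of the inputs.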

Taking the first approximations to the moments of the discrete-time distribution, when the
death-rate remains constant, viz.
mean $= (\lambda - 1)/\alpha = m$,
$\sigma^2 = 2dm/\{1 - (1/\lambda)^2\}$,
$\mu_3 = 0$,
then the normal approximation is

$$\Theta_S \sim \sqrt{2\pi}\,\sigma \exp[\tfrac{1}{2}(S - m)^2/\sigma^2].$$


These figures are given in the third column of the above table, and it will be seen that 
although the departures from normality of the actual distribution are appreciable, yet on 
the whole the approximation indicates the order of magnitude of the observed recurrence 
times, more particularly for the states which are less than the mean value. By extrapolation 


$$1/P(1) \sim 7.0 \times 10^{70},$$
and it is evident, without proceeding any further, and without taking the approximate 


value too literally, that the probability of random extinction for this system is negligible, 
even in the case of the discrete-time model with its larger variance. 
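
The normal approximation to the recurrence times, and the extrapolation to the state N = 1, can be reproduced with a few lines of Python. This is a sketch only; the function name is hypothetical, and the values m = 100 and $\sigma^2 = 2dm/\{1 - (1/\lambda)^2\}$ with d = 0.1145, $\lambda = 2$ are those of model IIa.

```python
import math


def normal_recurrence_time(state, mean, variance):
    """Normal approximation sqrt(2 pi) sigma exp[(S - m)^2 / (2 sigma^2)]."""
    return math.sqrt(2.0 * math.pi * variance) * math.exp(0.5 * (state - mean) ** 2 / variance)


# Model IIa: first-approximation moments of the discrete-time distribution.
m_iia = (2.0 - 1.0) / 0.01
var_iia = 2.0 * 0.1145 * m_iia / (1.0 - (1.0 / 2.0) ** 2)
```

With these values `normal_recurrence_time(100, m_iia, var_iia)` is close to 13.85, and the value at the state 1 is of order $10^{70}$.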
8. COMPARISON OF APPROXIMATE MOMENTS WITH EMPIRICAL
RESULTS FOR A TWO-SPECIES SYSTEM


Two bivariate distributions were computed for a system of two competing species
fluctuating in the region of the stable stationary state. It was assumed in both cases that
the death-rate (d) of each species remained constant, and in the set of deterministic equations
defining the expectations for this type of system (Leslie & Gower, 1958),

$$E\{N_1(t+1) \mid N_1(t), N_2(t)\} = \frac{\lambda_1 N_1(t)}{1 + \alpha_1 N_1(t) + \beta_1 N_2(t)}, \tag{30}$$

$$E\{N_2(t+1) \mid N_1(t), N_2(t)\} = \frac{\lambda_2 N_2(t)}{1 + \alpha_2 N_2(t) + \beta_2 N_1(t)}, \tag{31}$$

the following parameters were adopted in System A,
$\lambda_1 = 2.5$, $d_1 = 0.1145$, $\alpha_1 = 0.008$, $\beta_1 = 0.003$,
$\lambda_2 = 2.0$, $d_2 = 0.3151$, $\alpha_2 = 0.00625$, $\beta_2 = 0.0025$.


In the second case, System B, the same set of parameters was used, except that the value
of $\alpha_1$ was changed to $\alpha_1 = 0.005$. The computed distributions were based on 1000
observations for A, and on 995 for B.


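The first approximations to the moments are (assuming that both $N_1$ and $N_2$ are
distributed normally about the stationary state)

$$\text{Mean}\,(N_1) = \frac{\alpha_2(\lambda_1 - 1) - \beta_1(\lambda_2 - 1)}{\alpha_1\alpha_2 - \beta_1\beta_2}, \tag{32}$$

$$\text{Mean}\,(N_2) = \frac{\alpha_1(\lambda_2 - 1) - \beta_2(\lambda_1 - 1)}{\alpha_1\alpha_2 - \beta_1\beta_2}, \tag{33}$$

and for the marginal distributions,

$$\mu_3(N_1) = \mu_3(N_2) = 0; \tag{34}$$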


while the variances and covariances may be obtained from the solution of the equations 
which have already been given for the discrete-time model (Leslie & Gower, 1958, § 5). 
The results are shown in Table 6. 








Table 6
                             System A                       System B
                      Approximation   Observed       Approximation   Observed
Mean ($N_1$)          150.0           150.82         268.4           270.16
Mean ($N_2$)          100.0           99.19          52.6            50.75
$\sigma(N_1)$         7.84            7.85           11.06           11.37
$\sigma(N_2)$         11.49           12.03          11.68           12.42
$\rho(N_1, N_2)$      -0.364          -0.446         -0.529          -0.602
$\gamma_1(N_1)$       0               +0.001         0               -0.005
$\gamma_1(N_2)$       0               +0.074         0               -0.022


The agreement between the approximations and the observed means, standard devia- 
tions and correlation coefficients seems very satisfactory in both cases; while the magnitude 
of the skewness coefficients suggests that both distributions could be regarded as
approximately bivariate normal in form.
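
The first-approximation means in Table 6 are the stationary point of the deterministic equations (30) and (31), that is, the solution of $\alpha_1 N_1 + \beta_1 N_2 = \lambda_1 - 1$ and $\beta_2 N_1 + \alpha_2 N_2 = \lambda_2 - 1$. A short Python sketch of that calculation follows; the function name is an assumption of the sketch.

```python
def stationary_means(lam1, alpha1, beta1, lam2, alpha2, beta2):
    """Stationary point (N1, N2) of the two-species deterministic equations."""
    det = alpha1 * alpha2 - beta1 * beta2
    n1 = (alpha2 * (lam1 - 1.0) - beta1 * (lam2 - 1.0)) / det
    n2 = (alpha1 * (lam2 - 1.0) - beta2 * (lam1 - 1.0)) / det
    return n1, n2
```

With the parameters of System A this returns means close to 150 and 100, and with $\alpha_1 = 0.005$ (System B) means close to 268.4 and 52.6.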
Birth-and-death processes, and the theory of carcinogenesis


By DAVID G. KENDALL* 
Magdalen College, Oxford 


1. INTRODUCTION


In March 1959, Professor J. Neyman pointed out to me that some calculations of mine 
(Kendall, 1952) relating to phenotypically delayed mutations could be adapted so as to form 
a very rough quantitative model for a current theory of carcinogenesis, and he kindly sug- 
gested that I should explore this possibility and report on it to the San Diego session of the 
Biometric Society in June 1959. I wish to begin by expressing my gratitude to him for 
suggesting this problem and for sketching the lines along which it might be tackled. 

We first outline the biological situation and describe the model which is intended to 
represent it. To begin with we suppose that we have a large (say infinite) population of 
normal cells subject to carcinogenic action (perhaps by irradiation) usually of two kinds; 
a ‘background’, which is always present, and an experimental enhancement of this of known 
intensity and acting over a known period. We suppose that the joint effect of these agents is 
to transform some normal cells into first-order mutants, which it will be convenient to call 
grey cells (simply in order to have a short name for them). Quantitatively we assume that in 
a time interval $(t_1, t_2)$ the number of grey cells produced in this way is a Poisson variable of
expectation
$$\int_{t_1}^{t_2} f(t)\,dt, \tag{1-1}$$


where f(t) represents the intensity of the carcinogenic action (of both kinds). In most experi- 
mental situations f(t) will be a step-function. 

Each grey cell will be supposed to generate a clone (identified with a benign growth) the 
individuals in which multiply independently with birth rate $\lambda$ and death rate $\mu$ as in the
simple birth-and-death process (for which see, for example, Kendall, 1952). An important
assumption is that $\mu$ is greater than $\lambda$; this ensures that the grey clones each become extinct,
with probability one, if given sufficient time in which to do so. Thus benign growths are here 
being represented by subcritical birth-and-death processes. 

It is next supposed that as a result of the same carcinogenic influences the grey cells are 
capable of being transformed into second-order mutants, which for ease of reference we shall 
call black cells. Two alternative assumptions can be made here: 

(A) the effect of a second-order mutation is to transform a single grey cell into a black cell; 
such mutations occur independently of the birth-and-death process controlling the growth 
of the clone to which the grey cell belongs; 

(B) second-order mutations occur (if at all) at epochs of cell division, and their effect is 
to convert one of the two fission products into a black cell, so that we have 


grey → grey + black.
We could (but will not now) add a third variation of the model, in which when a second-order 
mutation acts at the moment of a cell division, both fission products are black. 


* This paper was prepared with the partial support of a research grant from the National 
Institutes of Health, U.S. Public Health Service. 
Once a black cell has been formed, it generates a black clone developing according to
a birth-and-death process, but now the birth rate L and death rate M satisfy the reversed in- 
equality: M < L. This means that we are identifying malignant growths with supercritical 
birth-and-death processes. The chance that a black clone eventually becomes extinct will be 
M/L; we could make this zero, if we wish, by setting M = 0. It is known, in any case, that if 
a population controlled by a supercritical birth-and-death process does not become extinct 
then its size grows without limit, ultimately exponentially.* 

We shall assume that for a grey cell extant at the epoch t there is a chance $\lambda\,dt$ that it will
split into two grey cells during (t, t + dt) and that there is a chance $\mu\,dt$ that it will ‘die’ (and
then cease to be numbered among the grey cells) during the same time period. (There is 
plainly an alternative assumption possible here; we could assume that after ‘death’ a grey 
cell is still available to be counted as grey by the experimenter, and that it is dead merely in 
that it no longer multiplies and is no longer available for conversion into a black cell. It 
would be quite easy to incorporate this assumption in the model, if desired. We shall indicate 
later how it could be done.) 

We have still to specify the probability with which a second-order transition involving 
a given grey cell will take place in a time interval (t, t + dt). If we are using Assumption (A),
then we write $\nu\,dt$ for the chance that the grey cell will blacken in the stated interval. If we
are using Assumption (B), we write $\nu\,dt$ for the chance that the grey cell will be converted
into a grey cell plus a black cell during the stated interval. These probabilities are of course
only intended to be correct up to terms in o(dt), and multiple events are supposed to have
probability o(dt), so that in very small intervals the four possibilities are (i) a $\lambda$-transition,
(ii) a $\mu$-transition, (iii) a $\nu$-transition, and (iv) no change.

The birth-and-death rates $\lambda$, $\mu$, L, and M are supposed to be constants. We may or may
not wish $\nu$ to be treated as constant. If it is time-dependent, the natural procedure would be
to put $\nu = \nu_0 f(t)$, where f is the function already used to measure the intensity of the
carcinogenic action at various times. Possibly the occurrence of second-order mutations will
depend on the carcinogenic agents in a different way. For example, if the background
agents can produce second-order mutants but the experimental agents cannot, then it
would be more correct to treat $\nu$ as constant. All we can do for the moment is to keep in
mind the possibility that $\nu$ may have to be time-dependent.

We have now explained the model of the phenomenon, but we still have to complete the 
model of the experiment. Presumably this will consist in the counting of black and grey 
clones after the end of the exposure period, but certainly not all clones will be counted. Some 
will be missed; it is more likely that small clones will be missed than that large ones will be; 
thirdly, we cannot suppose that the grey and black clones of equal size will be detected with 
equal frequency. We must therefore introduce a function $C_g(n)$ giving the probability that
a grey clone of size n will be counted by the experimenter, and a similarly defined function
$C_b(n)$ for the black clones. Neyman has suggested that one might take

$$C_g(n) = 1 - \gamma^n, \qquad C_b(n) = 1 - \beta^n, \tag{1-2}$$
where $\beta$ and $\gamma$ are positive numbers less than unity. This is a plausible assumption, because
we might argue that if $\beta$ is the chance that a single black cell will be missed, then $\beta^n$ will be
the chance that a clone of size n will be missed, and $1 - \beta^n$ will be the chance that it will be


* This is most readily established (as I think T. E. Harris has pointed out) by observing that the 
population-size divided by the exponential of (L − M)t yields a positive martingale.
counted, if (an unrealistic assumption) we grant that the individuals in a clone can be
treated independently in this way. Bearing in mind that the experimenter will see not 
a complete clone but a two-dimensional section of one, it might seem better to replace n by 
the two-thirds power of n in formulae (1-2). Neyman reports that the formulae (1-2) do not 
fit very well, being not steep enough, but they have one immense theoretical advantage: 
they fit in admirably with the (inevitable) employment of generating functions. The two- 
thirds power law does not have this convenient property and will not be further considered 
here. In fact we shall work exclusively with the Neyman laws (1-2), so as to make the 
maximum of analytic progress. At a later stage this is a feature of the model which will 
probably require modification. 

Another modification which may be needed is the abandonment of the simple
negative-exponential law, $e^{-\tau}\,d\tau \quad (0 < \tau < \infty)$, (1-3)
for the time intervals between the successive events (of course there is always a parameter 
to be inserted in (1-3) so as to give the correct mean time interval). It may be much better 
to use multiple-phase birth-and-death processes of the type investigated by Waugh (1955) 
and by myself (1952) to represent the growth of the grey and black clones. This, however, 
would greatly complicate the mathematics, and for the present purposes the model described 
above is the natural one to study. It is the simplest of the various models proposed by 
Neyman (1958) in his original formulation of the problem.* 


2. THE CASE OF A SINGLE GREY CLONE 


To start with we shall consider the history of a single grey clone descended from a single 
grey cell formed at the epoch $t = 0$. At a subsequent epoch $t$, we will have say $g_t$ grey cells and
$B_t$ black clones (of various sizes). Consider one of these black clones, formed say at the epoch $u$
(where $0 < u < t$); it will have been growing for a period $(t - u)$ and if its size is $b$ then we shall
have (formula (81) of Kendall, 1952)


$$E\{z^{b}\} = \frac{M(T-1) - (MT-L)\,z}{(LT-M) - L(T-1)\,z}, \quad\text{where } T = \exp[(L-M)(t-u)]. \tag{2-1}$$





The chance that it will be counted, given that its size is $b$, is $1 - \gamma^{b}$, and so (averaging over
the $b$-distribution) the final chance that it will be counted is

$$q(t-u) = \frac{(L-M)(1-\gamma)\,T}{(LT-M) - L(T-1)\,\gamma}, \quad\text{where } T = \exp[(L-M)(t-u)]. \tag{2-2}$$

When $(t - u)$ (the life-time of the black clone) tends to infinity then this chance tends to the
limit $q(\infty) = 1 - M/L$ (which is just the chance that the black clone will become infinitely
large instead of becoming extinct).

* For other stochastic models of tumour-formation see Arley & Iverson (1952) and also Armitage &
Doll (1957).

We now introduce the double generating function

$$\phi(z, w, t) = E\{z^{g_t} w^{B_t}\} \tag{2-3}$$

for the joint distribution of the number of grey cells ($g_t$) and the number of detected black
clones ($B_t$). We easily obtain the following integral equation for the function $\phi$, by considering
(a) whether or no an event occurs in $(0, t)$, (b) the epoch $u$ of the first event (if there is
one), (c) the nature of the first event, and (d) what happens after the first event. We are here
using Assumption (A). We shall indicate the effect of employing Assumption (B) in a moment.
If $\phi_{t-u}$ denotes $\phi(z, w, t-u)$, then the integral equation can be written*

$$\phi(z, w, t) = z\exp[-(\lambda+\mu+\nu)t] + \int_0^t \{\lambda\phi_{t-u}^{2} + \mu + \nu[1 - (1-w)\,q(t-u)]\}\exp[-(\lambda+\mu+\nu)u]\,du. \tag{2-4A}$$
In writing out the integral equation we have also assumed that v is constant. If it were not 
constant, then vt in the first and vu in the second exponential would have to be replaced by 


$$\int_0^t \nu(\tau)\,d\tau, \qquad \int_0^u \nu(\tau)\,d\tau,$$


respectively, and v within the braces would have to be replaced by v(u). This would be 
a tiresome complication, and to keep things as simple as possible we shall throughout the 
rest of this paper suppose v constant. (This means that either (i) we suppose that only the 
background can produce second-order mutations or (ii) we keep the experimental carcino- 
genic agent at a constant intensity throughout the experiment.) 

We now transform (2-4A)† by changing the time variable from $u$ to $\tau = t - u$, by multiplying
through by the exponential of $(\lambda + \mu + \nu)t$ and by carrying out a differentiation with
regard to $t$. We obtain

$$D\phi = \lambda\phi^{2} - (\lambda+\mu+\nu)\phi + \mu + \nu[1 - (1-w)\,q(t)], \tag{2-5A}$$
with the side-condition
$$\phi(z, w, 0) = z; \tag{2-6A}$$
here $D$ denotes $d/dt$.

This is a differential equation of the Riccati type for $\phi$, but unfortunately it does not have
constant coefficients. If we had used Assumption (B) instead of (A) then the term
$\nu[1 - (1-w)\,q(t)]$ would have had an extra factor $\phi$, so that we should still have had a Riccati
equation. In what follows we shall continue to assume both (A) and the constancy of $\nu$.

Three things can be done with the equation (2-5A) (apart from solving it, or the equivalent
(2-4A), numerically). First of all we can easily find the joint moments of $g_t$ and $B_t$ by
differentiating (2-5A) as many times as may be appropriate with regard to $z$ and $w$, and then
putting $z = w = 1$. The term $\phi^{2}$ and its successors in this procedure will all be linearized, and
one obtains a sequence or rather a double sequence of linear differential equations for
product moments of various orders for these two random variables. The process is quite
elementary, but we will not carry it out here because we shall make no use of the results.

Next, we can consider an approximation to (2-5A) in which we replace $q(t)$ by its limit
$q(\infty) = 1 - M/L$ (approached when $t \to \infty$). We then have

$$D\phi = \lambda\phi^{2} - (\lambda+\mu+\nu)\phi + (\mu+\nu g), \quad\text{where } g = 1-(1-w)\,q(\infty). \tag{2-7A}$$


Let us write
$$\lambda(\phi - c)(\phi - d)$$

for the quadratic which occurs on the right-hand side of the approximating differential 
equation. We have so far used z and w as the customary complex arguments of generating 
functions, so that |z| < 1 and |w| < 1, but in fact we can just as well suppose that z and w are 

* The letter A attached to an equation number indicates that Assumption (A) was used in the 
derivation of that equation. 

† If a grey cell is still reckoned as part of the grey clone after the cell's 'death', then $\mu$ inside the
braces in (2-4A) should be replaced by $\mu z$, and corresponding adjustments will be necessary in the
following formulae.
real and satisfy $0 < z < 1$ and $0 < w < 1$ (and let $z$ and $w$ approach 1 from below when we
want to compute moments, etc.). If we do this, then the roots of the characteristic quadratic
will be real and distinct and positive. Let us suppose for definiteness that $0 < c < d$; of course
$c$ and $d$ will depend on the variable $w$. The approximating differential equation then has the
solution
$$\frac{\phi - c}{\phi - d} = \frac{z - c}{z - d}\exp[-(d-c)\lambda t],$$
and so when $t$ tends to infinity $\phi(z, w, t)$ (or rather, our approximation) approaches the limit
$c(w)$, which is independent of $z$.
It will be useful to record a few facts about the function $c(w)$. Evidently it is given by

$$c(w) = \frac{1}{2\lambda}\left\{\lambda+\mu+\nu - \sqrt{(\lambda+\mu+\nu)^{2} - 4\lambda(\mu+\nu g)}\right\}, \tag{2-8A}$$

where $g = 1-(1-w)(1-M/L)$, and it is an elementary matter to expand this in powers of
$w$ by the binomial theorem. Notice that (2-8A) does generate a probability distribution, for
if $w = 1$ then the square root becomes $(\mu+\nu-\lambda)$ and so $c(1) = 1$; also it is easy to verify that
in the power-series expansion all the coefficients are positive. This suggests that our
approximate calculation has an exact interpretation, and in fact this is so.

Consider what happens after a long time $t$ has elapsed. The grey clone will have become
extinct, and before dying out it will have produced a certain number of 'pioneer' black cells,
each of which will have generated a black clone and some of which will have generated black
clones which themselves escape extinction. These latter black clones are the ones which will
be counted by the experimenter (for they will each be very large in size). Accordingly from
purely biological arguments we can see that, when $t \to \infty$, the function $\phi(z, w, t)$ will ultimately
become independent of $t$ and of $z$, and will converge to a function of $w$ alone which will
generate the distribution of the total number of black clones which ultimately escape
extinction. Next we observe that the equation (2-7A), which we used before as an approximation
to (2-5A), has the following exact interpretation; it is the differential equation which
with the boundary condition (2-6A) determines a generating function for the distribution of
the number of grey cells $g_t$, together with the number of black clones which were formed
during the period $(0, t)$ and which ultimately will escape extinction. It is now clear that, when
$t \to \infty$, the solutions to (2-5A) and (2-7A) (in each case with (2-6A) as a side condition) will
converge to the same limit. Thus, as an exact result, we have the very useful formula


$$\lim_{t\to\infty}\phi(z, w, t) = c(w) \quad (0 < z < 1,\ 0 < w < 1). \tag{2-9A}$$


An analytical proof of this would also be possible, starting with the integral equation (2-4 A) 
and proceeding as in § 4 of Waugh’s paper (1955), but the present method is adequate for 
our purposes and is much more illuminating. 
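
The convergence stated in (2-9A) can also be checked numerically. The following is a minimal sketch in Python, not part of the original argument: it integrates the Riccati equation (2-5A) by a fixed-step Runge-Kutta scheme and compares the long-time value with $c(w)$ from (2-8A). The parameter values, the name `gamma` for the detection parameter, and all function names are illustrative assumptions; the sketch also assumes $L > M$ and $\mu + \nu > \lambda$.

```python
import math

def q(t, L, M, gamma):
    """Chance (2-2) that a black clone of age t is counted (assumes L > M)."""
    inv_T = math.exp(-(L - M) * t)  # 1/T; dividing through by T avoids overflow
    return (L - M) * (1.0 - gamma) / ((L - M * inv_T) - L * (1.0 - inv_T) * gamma)

def c_limit(w, lam, mu, nu, L, M):
    """Smaller root c(w) of the quadratic in (2-7A), written as in (2-8A)."""
    g = 1.0 - (1.0 - w) * (1.0 - M / L)
    s = lam + mu + nu
    return (s - math.sqrt(s * s - 4.0 * lam * (mu + nu * g))) / (2.0 * lam)

def phi(z, w, t_end, lam, mu, nu, L, M, gamma, steps=20000):
    """Integrate (2-5A) from phi(z, w, 0) = z with the classical RK4 scheme."""
    def rhs(t, p):
        return (lam * p * p - (lam + mu + nu) * p + mu
                + nu * (1.0 - (1.0 - w) * q(t, L, M, gamma)))
    h = t_end / steps
    p, t = z, 0.0
    for _ in range(steps):
        k1 = rhs(t, p)
        k2 = rhs(t + h / 2, p + h * k1 / 2)
        k3 = rhs(t + h / 2, p + h * k2 / 2)
        k4 = rhs(t + h, p + h * k3)
        p += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        t += h
    return p

# Hypothetical rates chosen only for illustration.
params = dict(lam=1.0, mu=1.5, nu=0.2, L=2.0, M=1.0)
for z in (0.1, 0.9):
    print(z, phi(z, 0.5, 60.0, gamma=0.3, **params), c_limit(0.5, **params))
```

For both starting values of $z$ the integrated solution should settle on the same number, which is the independence of $z$ asserted in the text.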


3. THE GENERAL PROBLEM 


In the last section we confined attention to a single grey clone; we must now consider the 
formation of the grey clones and the ensemble of black clones which are derived from them. 
We start by thinking of the events which follow from the possibility of a grey cell being 
formed in the time interval $(u, u+du)$, where $0 < u < t$ and $(0, t)$ is the interval of exposure.
A grey cell will be formed then with a probability $f(u)\,du$. Suppose that one is formed, and
that at the end of the experiment it yields $g$ grey cells and $B'$ detected black clones. Suppose
that it yields $G'$ detected grey clones, where $G' = 0$ or 1. Then
$$E\{z^{G'}w^{B'} \mid g, B'\} = (1-\gamma^{g})\,z\,w^{B'} + \gamma^{g}\,w^{B'},$$
and so
$$E\{z^{G'}w^{B'}\} = \phi(\gamma, w, t-u) + z[\phi(1, w, t-u) - \phi(\gamma, w, t-u)], \tag{3-1}$$


and so the contribution to the double generating function for the distribution of the numbers
of grey and black clones formed during $(0, t)$ and detected at the epoch $t$, due to the possibility
of a grey cell being formed during $(u, u+du)$, is

$$1 - f(u)\,du + f(u)\,du\,E\{z^{G'}w^{B'}\} + o(du),$$

and to find the generating function for the joint distribution of the total numbers $X_t$ and $Y_t$
of detected grey and black clones, we have to form the product of such expressions for each
sub-interval $(u, u+du)$ in the interval of observation. Thus, finally, we get


$$\Psi(z, w, t) = E\{z^{X_t}w^{Y_t}\} = \exp[-R(w,t) - (1-z)\,S(w,t)], \tag{3-2}$$
where
$$R(w,t) = \int_0^t [1 - \phi(1, w, t-u)]\,f(u)\,du, \qquad S(w,t) = \int_0^t [\phi(1, w, t-u) - \phi(\gamma, w, t-u)]\,f(u)\,du. \tag{3-3}$$


The direct use of these formulae is hardly practicable, because we do not have explicit 
formulae for the function ¢, but we can obtain useful and interesting results of two quite 
different types by pursuing the argument a little further. 


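First, let us calculate the means, variances and covariance of the random variables $X_t$ and
$Y_t$. We find that

$$E\{X_t\} = S(1,t) = \int_0^t [1 - \phi(\gamma, 1, t-u)]\,f(u)\,du; \tag{3-4}$$

$$E\{Y_t\} = -\frac{\partial}{\partial w}R(1,t) = \int_0^t \frac{\partial}{\partial w}\phi(1, 1, t-u)\,f(u)\,du; \tag{3-5}$$

$$E\{X_t^{2}\} = [S(1,t)]^{2} + S(1,t); \qquad \operatorname{var}(X_t) = E\{X_t\}; \tag{3-6}$$

$$E\{Y_t^{2}\} = \left[\frac{\partial}{\partial w}R(1,t)\right]^{2} - \frac{\partial^{2}}{\partial w^{2}}R(1,t) - \frac{\partial}{\partial w}R(1,t);$$

$$\operatorname{var}(Y_t) = -\left(\frac{\partial^{2}}{\partial w^{2}} + \frac{\partial}{\partial w}\right)R(1,t) = \int_0^t \left[\frac{\partial^{2}}{\partial w^{2}} + \frac{\partial}{\partial w}\right]\phi(1, 1, t-u)\,f(u)\,du; \tag{3-7}$$

$$\operatorname{cov}(X_t, Y_t) = \frac{\partial}{\partial w}S(1,t) = \int_0^t \frac{\partial}{\partial w}\{\phi(1, 1, t-u) - \phi(\gamma, 1, t-u)\}\,f(u)\,du. \tag{3-8}$$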


In order to use these formulae* we need to know $\phi(\gamma, 1, t)$, the first and second $w$-derivatives
of $\phi(1, w, t)$ at $w = 1$, and the first $w$-derivative of $\phi(\gamma, w, t)$ at $w = 1$. We can calculate


* In all these formulae the w-differentiations are to be carried out before w is made equal to unity. 
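these (assuming (A) and the constancy of $\nu$) by going back to equations (2-5A) and (2-6A). Let us
write $\phi'$, $\phi''$ for the first and second $w$-derivatives of $\phi$. Then we shall have
$$D\phi' = 2\lambda\phi\phi' - (\lambda+\mu+\nu)\phi' + \nu q(t),$$
$$D\phi'' = 2\lambda\phi\phi'' + 2\lambda\phi'^{2} - (\lambda+\mu+\nu)\phi'',$$
and
$$\phi(z, 1, 0) = z, \quad \phi'(z, 1, 0) = 0, \quad\text{and}\quad \phi''(z, 1, 0) = 0.$$
Now put $j(t) = \phi(\gamma, 1, t)$; $h_1(t) = \phi'(1, 1, t)$; $h_2(t) = \phi''(1, 1, t)$; and $k(t) = \phi'(\gamma, 1, t)$. Then $j(t)$
is the solution to the differential equation
$$Dj = \lambda j^{2} - (\lambda+\mu+\nu)j + (\mu+\nu), \quad j(0) = \gamma,$$
and so it is given by
$$\frac{j(t) - s}{j(t) - s'} = \frac{\gamma - s}{\gamma - s'}\exp[-(\mu+\nu-\lambda)t],$$
where $s$ and $s'$ are the roots of the quadratic equation
$$\lambda s^{2} - (\lambda+\mu+\nu)s + (\mu+\nu) = 0;$$
that is, $s = 1$, $s' = (\mu+\nu)/\lambda$, and

$$j(t) = 1 - \left(\frac{\mu+\nu-\lambda}{\lambda}\right)\sum_{m=1}^{\infty}\left(\frac{\lambda-\lambda\gamma}{\mu+\nu-\lambda\gamma}\right)^{m}\exp[-m(\mu+\nu-\lambda)t]. \tag{3-9A}$$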


The function $q(t)$ is as before given by (2-2), and $h_1(t)$, $h_2(t)$, and $k(t)$ can now be determined as
the solutions to the linear first-order differential equations
$$Dh_1 + (\mu+\nu-\lambda)h_1 = \nu q(t), \quad h_1(0) = 0;$$
$$Dh_2 + (\mu+\nu-\lambda)h_2 = 2\lambda h_1^{2}, \quad h_2(0) = 0;$$
$$Dk + (\mu+\nu+\lambda-2\lambda j)k = \nu q(t), \quad k(0) = 0. \tag{3-10A}$$
In this way we can determine all the auxiliary functions needed if (3-4)-(3-8) are to be used in
the calculation of the first- and second-order moments of $(X_t, Y_t)$.

We now return to the basic formula (3-2) and observe that the marginal distributions of
$X_t$ and $Y_t$ are generated by

$$E\{z^{X_t}\} = \exp[(z-1)\,S(1,t)], \qquad E\{w^{Y_t}\} = \exp[-R(w,t)], \tag{3-11}$$

so that $X_t$ has a Poisson distribution with mean $S(1,t)$. With the usual assumptions, this
reduces to

$$E\{X_t\} = \int_0^t [1 - j(t-u)]\,f(u)\,du, \tag{3-12}$$

where $j(t)$ is given by (3-9A).

It is not so easy to draw immediate conclusions from the generating function for the
marginal distribution of $Y_t$, because it depends on a $\phi$-function which we have not been able
to determine in closed form. However, let us make Assumption (A), let us further assume
that the 'feeding' function $f$ is constant throughout the period of exposure $(0, t)$, and let us
examine the moments of $(X_t, Y_t)$ when the period of exposure $t$ is large. We then easily obtain
the following asymptotic formulae

$$\lim_{t\to\infty} E\{X_t\} = \frac{f}{\lambda}\log\frac{\mu+\nu-\lambda\gamma}{\mu+\nu-\lambda}, \tag{3-13A}$$

$$\lim_{t\to\infty} E\{Y_t\}/t = \frac{f\nu(1-M/L)}{\mu+\nu-\lambda}, \tag{3-14A}$$

$$\lim_{t\to\infty} \operatorname{var}(Y_t)/t = \frac{f\nu(1-M/L)}{\mu+\nu-\lambda} + \frac{2\lambda f\nu^{2}(1-M/L)^{2}}{(\mu+\nu-\lambda)^{3}}, \tag{3-15A}$$

$$\lim_{t\to\infty} \operatorname{cov}(X_t, Y_t) = \frac{2\lambda f}{\mu+\nu-\lambda}\int_0^{\infty}[1 - j(\tau)]\,k(\tau)\,d\tau > 0. \tag{3-16A}$$


Only the last of these results requires any explanation. From (3-8) we know that
$$\operatorname{cov}(X_t, Y_t) = f\int_0^t [h_1(\tau) - k(\tau)]\,d\tau,$$
and from (3-10A) we know that $r(t) = h_1(t) - k(t)$ satisfies the differential equation
$$Dr + (\mu+\nu-\lambda)r = 2\lambda(1-j)k, \quad r(0) = 0.$$
Now $j(\infty) = 1$, $k(\infty) = c'(1) = h_1(\infty)$, and so* $r(\infty) = 0$. Thus, integrating the last differential
equation, we see that

$$\int_0^{\infty} r(\tau)\,d\tau = \frac{2\lambda}{\mu+\nu-\lambda}\int_0^{\infty}[1 - j(\tau)]\,k(\tau)\,d\tau,$$

the integral being convergent. It will be noticed that the covariance is positive.

In conclusion it should be noted that all our formulae will require adjustment if grey or 
black cells are already present in the system at the epoch $t = 0$. It will by now be clear how
such adjustments can be made. 
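
As a check on the asymptotic formulae, the sketch below (Python; a minimal illustration under stated assumptions, not the author's computation) evaluates $E\{X_t\}$ from (3-12) for a constant feeding rate $f$ by Simpson's rule and compares it with the limit (3-13A). It uses the closed form of $j(t)$ obtained by solving the relation preceding (3-9A) for $j$, which sums the series in (3-9A). The numerical values and the names `gamma`, `mean_X` and `limits` are hypothetical, and $\mu + \nu > \lambda$ is assumed.

```python
import math

def j(t, lam, mu, nu, gamma):
    """Closed form of j(t) = phi(gamma, 1, t); equivalent to the series (3-9A)."""
    s_prime = (mu + nu) / lam
    kappa = (1.0 - gamma) / (s_prime - gamma)
    decay = math.exp(-(mu + nu - lam) * t)
    return (1.0 - kappa * s_prime * decay) / (1.0 - kappa * decay)

def mean_X(t, f, lam, mu, nu, gamma, steps=20000):
    """E{X_t} from (3-12) with constant f, by composite Simpson (steps even)."""
    h = t / steps
    total = 0.0
    for i in range(steps + 1):
        weight = 1 if i in (0, steps) else (4 if i % 2 else 2)
        total += weight * (1.0 - j(i * h, lam, mu, nu, gamma))
    return f * total * h / 3.0

def limits(f, lam, mu, nu, gamma, L, M):
    """Right-hand sides of (3-13A), (3-14A) and (3-15A)."""
    a = mu + nu - lam
    p = 1.0 - M / L
    mean_x = (f / lam) * math.log((mu + nu - lam * gamma) / a)
    mean_y_rate = f * nu * p / a
    var_y_rate = mean_y_rate + 2.0 * lam * f * nu ** 2 * p ** 2 / a ** 3
    return mean_x, mean_y_rate, var_y_rate

# Hypothetical parameter values, for illustration only.
print(mean_X(80.0, 2.0, 1.0, 1.5, 0.2, 0.3))
print(limits(2.0, 1.0, 1.5, 0.2, 0.3, 2.0, 1.0))
```

The variance rate in (3-15A) exceeds the mean rate in (3-14A), so the count of detected black clones is over-dispersed relative to a Poisson variable.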


4. THE SO-CALLED ‘THRESHOLD’ EFFECT 


Neyman (1958) and Neyman & Scott (1959) are at variance with some earlier writers on 
the ‘two-hit’ theory of carcinogenesis, in that they do not agree with the existence of what 
has (misleadingly) been called a ‘threshold’; to be more specific, Neyman and Scott conclude 
that the effects of the carcinogen are in direct proportion to its intensity. It therefore seems 
appropriate to offer a comment on this apparent disagreement. The relevant formula is
(3-14A), which gives the expected rate of incidence of malignant tumours for continued
(long) exposure to a carcinogenic agent maintained at a constant intensity $f$. (Formula
(3-14A) relates in the first instance to the number of detected tumours, but it also applies to
the total number $B_t$ of tumours, as may be seen from the fact that it does not involve the
detection-parameter $\gamma$.) We must now distinguish two possibilities:

(i) The second-order mutations occur at a rate independent of the carcinogen, so that v is 
an absolute constant. This is the assumption made by Neyman & Scott, and it makes the 
incidence of tumours strictly proportional to the intensity f, as they reported in their papers. 

(ii) The second-order mutations occur at a rate proportional to the intensity of the
carcinogenic agent, so that $\nu = \nu_0 f$, where $\nu_0$ is an absolute constant. Then

$$\lim_{t\to\infty} E\{B_t\}/t = \frac{f^{2}\nu_0(1-M/L)}{\mu-\lambda+\nu_0 f}, \tag{4-1}$$

and so now the incidence of tumours is for low intensities (i.e. $f \ll (\mu-\lambda)/\nu_0$) proportional to
the intensity squared, and it becomes proportional to the first power of the intensity only
when the latter is large.

Thus the second assumption implies a difference in the character of the response at low 
and high intensities, and so it could be said that in this case there is a sort of threshold. But 
the use of this term is unfortunate because the transition is not at all abrupt. 
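
The gradual character of this transition can be seen by computing the local slope of the response on a log-log scale. The sketch below is an illustrative Python calculation based on (4-1); the parameter values and function names are assumptions chosen for the example, with $\mu > \lambda$.

```python
import math

def incidence_rate(f, lam, mu, nu0, L, M):
    """Limiting incidence rate (4-1) when nu = nu0 * f."""
    return f * f * nu0 * (1.0 - M / L) / (mu - lam + nu0 * f)

def log_log_slope(f, lam, mu, nu0, L, M, ratio=1.01):
    """Numerical slope of log(rate) against log(intensity) near f."""
    hi = incidence_rate(f * ratio, lam, mu, nu0, L, M)
    lo = incidence_rate(f, lam, mu, nu0, L, M)
    return (math.log(hi) - math.log(lo)) / math.log(ratio)

# Hypothetical rates: the slope falls smoothly from about 2 to about 1.
for f in (1e-4, 1.0, 5.0, 25.0, 1e4):
    print(f, log_log_slope(f, lam=1.0, mu=1.5, nu0=0.1, L=2.0, M=1.0))
```

With these values the change of slope is centred near $f = (\mu-\lambda)/\nu_0 = 5$ and is spread over several orders of magnitude of intensity.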


* It is easily verified that $c'(1)$ is finite.
The quantal response analysis of a series of biological assays
on the same subjects 


By J. R. ASHFORD,* C. S. SMITH AND SUSANNAH BROWN
National Coal Board, Pneumoconiosis Field Research 


SUMMARY


This paper deals with some problems associated with the analysis of the results of a series 
of separate biological assays on the same group of subjects, the reaction of the individual 
organism to the applied stimulus being assumed to be non-reversible. A brief description is 
given of the methods applied in the analysis of the results of single assays both for single 
poisons and for mixtures of poisons, with particular reference to the effect of errors of 
measurement of the observed quantal response. These methods are extended to cover 
a series of two or more assays, first, under conditions in which the response is measured 
without error, and secondly, when the response measurement is subject to error. The 


application of the proposed procedures is illustrated by an example from the field of research 
in industrial medicine. 


1. INTRODUCTION


As a general rule the subjects selected for testing in a biological assay have not previously 
been exposed to the material under investigation, nor to any other material which affects the 
same physiological system. When the reaction of the individual organism to the applied 
stimulus is non-reversible there are, however, some circumstances in which it is convenient 
to make use of the same subjects for repeated experimentation. Situations of this kind are 
encountered in studies of human populations, for example, when periodic medical examina- 
tions are carried out on groups of men exposed to some health hazard which is a function of 
time. 

This paper is concerned with methods of analysing the results of a series of two or more 
separate assays on the same group of subjects. In the applications considered each assay 
involves the same poison or, alternatively, several poisons are applied on each assay, 
provided they all affect the same physiological system (i.e. under conditions of ‘similar joint 
action’). The reaction of the subject to the applied stimulus is assumed to be non-reversible 
and consideration is given to the situation in which the observed quantal response is subject 
to errors of measurement. The effect of such errors is particularly important when dealing 
with more than one assay, since errors can lead to a response being observed on an earlier 
assay but not on a later one. 

There are numerous references in the literature to methods for analysing the results of 
biological assays in which the reaction of the individual organism (measured without error) 
depends on the time elapsing after the poison was administered, as well as on the dose of the 
poison (Finney, 1952). White & Graca (1958) have described an application of this kind in 
which each organism was examined at a number of pre-specified times after the poison was 
applied, and show how the relationship between the probability that the quantal response 


* Now with U.K. Atomic Energy Authority, Winfrith, Dorchester, Dorset. 
will be observed and the time at which the examination was carried out may be derived. The
situation considered in this paper may be regarded as a generalization of that described by 
White & Graca, in the sense that the effect of time between successive examinations 
following the initial administration of the poison is equivalent to a special case of the 
application of a number of doses of another poison in a series of assays on the same subjects. 
White & Graca do not consider the complications caused by errors in the measurement of the 
response. 
2. THE ANALYSIS OF SINGLE ASSAYS 


The statistical methods for analysing quantal response data arising from assays on sub- 
jects with no previous exposure to the material under investigation are well established 
(Finney, 1952). The approach normally adopted is based on the concept of the ‘tolerance’ of 
the individual organism, which is defined as the dose which would be just sufficient to pro- 
duce the quantal response and may be expected to show some variation from subject to 
subject. It is usual to assume that there exists a transformation, say f(x), of the applied 
dosage x such that the distribution of tolerances in the population under consideration, 
denoted $\phi\{f(x)\}$, is symmetrical, with zero mean and unit variance and covering the whole
range of values of $f(x)$ between $-\infty$ and $+\infty$. Under these circumstances the probability of
response at dose $x$ may be written

$$P = \int_{-\infty}^{f(x)}\phi(\theta)\,d\theta, \tag{1}$$


where f(x) is termed the ‘equivalent deviation’. Various mathematical forms have been 
proposed to represent the tolerance distribution, of which the most commonly used in 
practical applications are the standard normal and logistic distributions. 

When a single poison is applied it is commonly found that the equivalent deviation may be 
represented by a linear function of the logarithm of the dose, of the form 


$$f(x) = a + b\log x, \tag{2}$$


where a and b characterize the response of the population to the particular poison. If the test 
subjects are exposed to a mixture of different poisons which affect the same physiological 
system the concept of the tolerance distribution may still be applied. When a mixture of 
$w$ different poisons $X_1, X_2, \ldots, X_w$ is applied at dose $(x_1, x_2, \ldots, x_w)$, the equivalent deviation is
a function of the applied dose, say $f_w(x_1, x_2, \ldots, x_w)$ (Plackett & Hewlett, 1952). If the various
poisons making up the mixture do not interact (i.e. under conditions of ‘simple similar 
action’) it may be shown (Ashford, 1958a) that the equivalent deviation may be written 


w 
SA&y Lg, -.->Ly) = 8, log, = jiavtby oe. ‘ (3) 
y= 


where the parameters a,, b, characterize the action of the poison X, when applied singly. If 
the poisons interact the equivalent deviation will in general be of a more complicated form 
than (3), depending on the nature of the interaction. 

If the test subjects are assigned at random to $k$ groups and the $i$th group includes $n_i$
subjects, of which $r_i$ manifest the characteristic response when the group is exposed to
a mixture of $w$ poisons $X_1, X_2, \ldots, X_w$ applied at dose $(x_{i1}, x_{i2}, \ldots, x_{iw})$, the logarithm of the
likelihood of any given set of observations may be written as

$$L = \text{constant} + \sum_{i=1}^{k}[r_i\log P_i + (n_i - r_i)\log(1 - P_i)], \tag{4}$$
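where $P_i = \int_{-\infty}^{f_i}\phi(\theta)\,d\theta$ [from (1)] and $f_i$ is a function of the applied dose, and of a number of
unknown parameters $u, v, \ldots$. The maximum likelihood estimates of these parameters may
be calculated directly or by the solution of a series of equations of the form

$$\frac{\partial L}{\partial u} = \sum_{i=1}^{k}\frac{n_i(p_i - P_i)}{P_i(1 - P_i)}\left(\frac{\partial P_i}{\partial u}\right) = 0, \tag{5}$$

where $p_i = r_i/n_i$ is the observed proportion of responses. The asymptotic variances and
covariances of the maximum likelihood estimates may be evaluated in the usual way and are given by

$$[\operatorname{cov}(\hat{u}, \hat{v})] = \left[-E\left(\frac{\partial^{2}L}{\partial u\,\partial v}\right)\right]^{-1} = \left[\sum_{i=1}^{k}\frac{n_i}{P_i(1 - P_i)}\left(\frac{\partial P_i}{\partial u}\right)\left(\frac{\partial P_i}{\partial v}\right)\right]^{-1}. \tag{6}$$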
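
For a single poison with the linear form (2) and a standard normal tolerance distribution, the log likelihood (4) and the score equations (5) can be written out directly. The following Python sketch is illustrative only: the doses and counts are invented, natural logarithms are assumed for the dose scale, and the function names are not from the paper.

```python
import math

def Phi(x):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def normal_pdf(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def log_likelihood(a, b, doses, n, r):
    """Equation (4) without the constant, with f_i = a + b*log(x_i)."""
    total = 0.0
    for x, n_i, r_i in zip(doses, n, r):
        P_i = Phi(a + b * math.log(x))
        total += r_i * math.log(P_i) + (n_i - r_i) * math.log(1.0 - P_i)
    return total

def score(a, b, doses, n, r):
    """Left-hand sides of (5) for the parameters a and b."""
    d_a = d_b = 0.0
    for x, n_i, r_i in zip(doses, n, r):
        f_i = a + b * math.log(x)
        P_i = Phi(f_i)
        # dP_i/da = pdf(f_i); dP_i/db = pdf(f_i) * log(x_i)
        common = n_i * (r_i / n_i - P_i) / (P_i * (1.0 - P_i)) * normal_pdf(f_i)
        d_a += common
        d_b += common * math.log(x)
    return d_a, d_b

# Invented data: four dose groups of twenty subjects each.
doses = [1.0, 2.0, 4.0, 8.0]
n = [20, 20, 20, 20]
r = [2, 7, 13, 18]
print(log_likelihood(-1.0, 1.0, doses, n, r), score(-1.0, 1.0, doses, n, r))
```

The maximum likelihood estimates are the values of `a` and `b` at which both components of `score` vanish; any standard root-finding or iterative weighting scheme may be used to locate them.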


The preceding analysis is based on the assumption that the quantal response measure- 
ment is not subject to error. There are, however, some circumstances in which this assump- 
tion is not justifiable. For example, when the observed quantal response corresponds to 
a subdivision of an underlying quantitative response scale and the subject is classified as 
‘responding’ or ‘not responding’ according to whether or not the underlying reaction is in 
excess of some specified value, it is commonly found that the quantitative measurement is 
subject to errors, which in turn give rise to corresponding errors in the measurement of the 
quantal response. In the case of similar joint action it is reasonable to assume that when 
a mixture of $w$ different poisons is applied at dose $(x_1, x_2, \ldots, x_w)$ the true quantitative
response $y$ of any particular subject may be expressed in the form

$$y = z(x_1, x_2, \ldots, x_w) + \epsilon, \tag{7}$$

where $\epsilon$ depends only on the subject and is independent of the applied dose. We shall assume
further that $\epsilon$ is normally distributed in the population of subjects with zero mean and
variance $\sigma^{2}$ and that the observed value $y'$ of the quantitative response to any given dose is
normally distributed about the true response $y$, with bias $\lambda$ and variance $\rho^{2}$, where $\lambda$ and $\rho$
characterize the error distribution. In these circumstances the observed response values $y'$
are normally distributed, with mean $(z+\lambda)$ and variance $(\sigma^{2}+\rho^{2})$. If the characteristic
quantal response is observed when $y' > Y$,* the probability of this event may be written


$$P(y' > Y) = \int_{Y}^{\infty}\frac{1}{\sqrt{2\pi(\sigma^{2}+\rho^{2})}}\exp\left[-\frac{1}{2}\frac{(t-z-\lambda)^{2}}{\sigma^{2}+\rho^{2}}\right]dt = \int_{-\infty}^{(z+\lambda-Y)/\sqrt{\sigma^{2}+\rho^{2}}}\frac{1}{\sqrt{2\pi}}\,e^{-\theta^{2}/2}\,d\theta = \Phi\left(\frac{z+\lambda-Y}{\sqrt{\sigma^{2}+\rho^{2}}}\right). \tag{8}$$

* This approach may be extended to cover the case of a semi-quantal response (i.e. $y < Y_1$,
$Y_1 < y < Y_2$, $y > Y_2$, etc.). (See Ashford, 1959a.)

Comparison with equation (1) shows that the equivalent deviation corresponding to the
response being observed may be written

$$f_w = \frac{z+\lambda-Y}{\sqrt{\sigma^{2}+\rho^{2}}}. \tag{9}$$

In the case of a single poison it may be shown (Ashford, 1959a) that when the 'true' quantal
dosage response relationship is linear the corresponding 'apparent' dosage response relationship
is also linear, although the slope of the regression line is reduced. Equation (9) shows
how this conclusion may be extended to mixtures of poisons. Thus, the relationship between 
the probability of response being observed (as opposed to definitely taking place) may be 
described by an expression of the form (1) and the same methods of analysis may be applied 
whether or not the response measurement is subject to error. A similar remark applies even 
when there is no underlying quantitative response, provided the errors are reasonably small 
and expression (1) holds good to a sufficient degree of approximation. 
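
The reduction in slope follows from (9) by simple arithmetic. The Python sketch below assumes, for illustration, that the error-free equivalent deviation $(z - Y)/\sigma$ is the straight line $a + b\log x$ of (2), and returns the intercept and slope of the apparent line when the measurement has bias and error variance as described above. The function name and the numerical values are hypothetical.

```python
import math

def apparent_probit_line(a, b, sigma, rho, bias=0.0):
    """Intercept and slope of the apparent line in log dose, from (9).

    Assumes the error-free equivalent deviation (z - Y)/sigma equals
    a + b*log(x), so that z - Y = sigma * (a + b*log(x)).
    """
    total_sd = math.sqrt(sigma ** 2 + rho ** 2)
    shrink = sigma / total_sd          # always less than 1 when rho > 0
    return a * shrink + bias / total_sd, b * shrink

# With sigma = 3 and rho = 4 the slope is multiplied by 3/5.
print(apparent_probit_line(-1.0, 2.0, sigma=3.0, rho=4.0, bias=0.5))
```

The slope is multiplied by $\sigma/\sqrt{\sigma^{2}+\rho^{2}}$, so the larger the error variance relative to the variation between subjects, the flatter the apparent dosage response line.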


3. REPEATED ASSAYS WHEN THE RESPONSE MEASUREMENT IS MADE WITHOUT ERROR 


It will be assumed that a succession of $w$ separate assays of poisons $X_1, X_2, \ldots, X_w$ [some
or all of which may be the same] is carried out on a batch of subjects, the poison applied on
the $j$th assay being denoted $X_j$. Each poison affects the same physiological system and leads
to the same characteristic quantal response, which is non-reversible and is not subject to
errors of observation. The test subjects are assigned at random to $k$ groups and the $i$th group
is exposed to a dose $x_{ij}$ of the poison $X_j$ on the $j$th assay. The total number of subjects in the
$i$th group is denoted $n_i$ and the number who have manifested the characteristic response by
the end of the $j$th assay (i.e. as a result of the 1st, 2nd, ..., and $j$th assays) is denoted
$r_{ij}$ $(1 \le j \le w)$. The number not manifesting the characteristic response after the last
($w$th) assay is denoted $s_{i(w+1)}$. As the response measurement is not subject to error it may be
asserted that $r_{i(j+1)} \ge r_{ij}$ for $j = 1, 2, \ldots, (w-1)$.

On the jth assay the probability of response will depend on the doses of the various 
poisons applied on the jth and previous assays and, for a subject belonging to the ith group, 
is given from (1) by the expression 


$$P_{ij} = \int_{-\infty}^{f_{ij}}\phi(\theta)\,d\theta, \tag{10}$$

where $f_{ij}$ is a function of the dose $(x_{i1}, x_{i2}, \ldots, x_{ij})$ and of a number of unknown parameters.
The probability of response as a result of the $j$th assay (i.e. no response after the $(j-1)$th
assay but response after the $j$th assay) may be written

$$Q_{ij} = P_{ij} - P_{i(j-1)}, \tag{11}$$
$j = 1, 2, \ldots, w$ (taking $P_{i0} = 0$), and the probability of no response after the last assay may be written
$$Q_{i(w+1)} = 1 - P_{iw}. \tag{12}$$


The number of subjects responding as a result of the $j$th assay may be written $s_{ij} = [r_{ij} - r_{i(j-1)}]$
and the probability of $s_{ij}$ responses in the $i$th group as a result of the $j$th assay $[j = 1, 2, \ldots, w]$
is given by the multinomial distribution, by the expression

$$P(s_{ij}) = \frac{n_i!}{\prod_{j=1}^{w+1}s_{ij}!}\prod_{j=1}^{w+1}Q_{ij}^{\,s_{ij}}, \tag{13}$$

as there are $(w+1)$ mutually exclusive possibilities [any subject may respond as a result of
the 1st, 2nd, ..., $w$th assays or may not respond even after the $w$th assay]. Thus, the logarithm
of the likelihood of any particular set of observations may be written


$$L = \text{constant} + \sum_{i=1}^{k}\sum_{j=1}^{w+1}s_{ij}\log Q_{ij}. \tag{14}$$
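If $u, v, \ldots$ denote parameters contained in $Q_{ij}$ the corresponding maximum likelihood estimates
are given by the solution of a series of equations of the form

$$\frac{\partial L}{\partial u} = \sum_{i=1}^{k}\sum_{j=1}^{w+1}\frac{s_{ij}}{Q_{ij}}\left(\frac{\partial Q_{ij}}{\partial u}\right) = 0. \tag{15}$$

The asymptotic variances and covariances of these estimates are given by the expression

$$[\operatorname{cov}(\hat{u}, \hat{v})] = \left[-E\left(\frac{\partial^{2}L}{\partial u\,\partial v}\right)\right]^{-1} = \left[\sum_{i=1}^{k}\sum_{j=1}^{w+1}\frac{n_i}{Q_{ij}}\left(\frac{\partial Q_{ij}}{\partial u}\right)\left(\frac{\partial Q_{ij}}{\partial v}\right)\right]^{-1}. \tag{16}$$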
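
The bookkeeping behind (11)-(14) is easily mechanized. The Python sketch below forms the increments $Q_{ij}$ from the cumulative probabilities $P_{ij}$, the counts $s_{ij}$ from the cumulative counts $r_{ij}$, and then evaluates (14) without the constant. The function names and the small data set are hypothetical; the cumulative probabilities are supplied directly rather than derived from a dose model.

```python
import math

def increments(P_row):
    """Q_{i1}, ..., Q_{i(w+1)} from cumulative P_{i1}, ..., P_{iw}; see (11)-(12)."""
    Q, prev = [], 0.0
    for P in P_row:
        Q.append(P - prev)
        prev = P
    Q.append(1.0 - prev)      # no response even after the last assay
    return Q

def counts(n_i, r_row):
    """s_{i1}, ..., s_{i(w+1)} from the cumulative response counts r_{ij}."""
    s, prev = [], 0
    for r in r_row:
        s.append(r - prev)
        prev = r
    s.append(n_i - prev)      # subjects who never responded
    return s

def log_likelihood(P, n, r):
    """Equation (14) without the constant term."""
    total = 0.0
    for P_row, n_i, r_row in zip(P, n, r):
        for s_ij, Q_ij in zip(counts(n_i, r_row), increments(P_row)):
            if s_ij:          # empty classes contribute nothing
                total += s_ij * math.log(Q_ij)
    return total

# One group of ten subjects and two assays (invented figures).
print(log_likelihood([[0.2, 0.5]], [10], [[2, 5]]))
```

Because the response is non-reversible and measured without error, the cumulative counts in each row must be non-decreasing, which guarantees that every $s_{ij}$ is non-negative.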


4. REPEATED ASSAYS WHEN THE RESPONSE MEASUREMENT IS SUBJECT TO ERROR 


If the response measurement is subject to error it is possible that a subject may be classi- 
fied as showing evidence of response if in fact the response has not taken place. Similarly, the 
response may not be observed even although it has actually occurred. When a series of 
assays is considered the fact that a subject may appear to have responded after one particular 
assay does not necessarily imply that the subject will appear to respond on subsequent 
assays. The results of each assay must therefore be considered separately and the methods 
described in the previous section are not applicable. Further treatment of the problem must 
depend on the assumed form of the error distribution and consideration will be given to the 
situation described in §2, which is probably the most important occurring in practical 
applications. The general methods of approach may, however, be extended to cover other 
assumptions. 

If two successive assays of mixtures of $w$ different poisons (some of which may be at zero
dose) are carried out on the same group of subjects, suppose that the resulting true quantitative
responses are denoted $y_1$ and $y_2$ and the corresponding observed quantitative responses
are denoted $y'_1$ and $y'_2$. Suppose also that on the first assay the doses are $(x_1^{(1)}, x_2^{(1)}, \ldots, x_w^{(1)})$ and that after the
second assay the total stimulus applied (on both assays) comprises doses $(x_1^{(2)}, x_2^{(2)}, \ldots, x_w^{(2)})$.
Following the notation in § 2, let $z_1$ and $z_2$ denote the corresponding values of $z$ on the first
and second assays, and let the means of the corresponding error distributions be denoted
$\lambda_1, \lambda_2$ and the variances $\rho_1^{2}, \rho_2^{2}$.

As a result of the two assays each subject may be assigned to one of four classes, according 
to whether or not the quantal response was observed on each assay. 

Class 1: $y'_1 < Y$, $y'_2 < Y$. No response on either assay.

Class 2: $y'_1 < Y$, $y'_2 > Y$. No response on first assay, but response on second.

Class 3: $y'_1 > Y$, $y'_2 < Y$. Response on first assay, no response on second.

Class 4: $y'_1 > Y$, $y'_2 > Y$. Response on both assays.

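Let $T_{i1}, T_{i2}, T_{i3}, T_{i4}$ denote the probabilities that a man belonging to the $i$th group is assigned
to one of these four classes. If $p(y'_1, y'_2)$ is the joint probability distribution of $y'_1$ and $y'_2$ for
given doses, we have

$$T_{i1} = \int_{y'_2=-\infty}^{Y}\int_{y'_1=-\infty}^{Y} p\,dy'_1\,dy'_2, \tag{17}$$
$$T_{i2} = \int_{y'_2=Y}^{\infty}\int_{y'_1=-\infty}^{Y} p\,dy'_1\,dy'_2, \tag{18}$$
$$T_{i3} = \int_{y'_2=-\infty}^{Y}\int_{y'_1=Y}^{\infty} p\,dy'_1\,dy'_2, \tag{19}$$
$$T_{i4} = \int_{y'_2=Y}^{\infty}\int_{y'_1=Y}^{\infty} p\,dy'_1\,dy'_2. \tag{20}$$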
Now, from (7) the observed quantitative responses may be expressed in the form

$$y'_1 = z_1 + \epsilon + \xi_1 \tag{21}$$
and
$$y'_2 = z_2 + \epsilon + \xi_2, \tag{22}$$
where $z_2 \ge z_1$, $\epsilon \sim N(0, \sigma^{2})$ and $\xi_i \sim N(\lambda_i, \rho_i^{2})$, $i = 1, 2$.

Since $\epsilon$, $\xi_1$ and $\xi_2$ are independent the joint distribution of $(y'_1, y'_2)$ is bivariate normal and
$$\operatorname{cov}(y'_1, y'_2) = \sigma^{2}, \quad \operatorname{var} y'_1 = \sigma^{2}+\rho_1^{2}, \quad \operatorname{var} y'_2 = \sigma^{2}+\rho_2^{2}.$$
For convenience of presentation we shall assume that the same error distributions apply on
both assays, so that $\lambda_1 = \lambda_2 = \lambda$ and $\rho_1 = \rho_2 = \rho$.
The correlation coefficient is thus $\tau = \sigma^{2}/(\sigma^{2}+\rho^{2})$, and the means of $y'_1$, $y'_2$ are $z_1$ and $z_2$.

If $t_{i1}$, $t_{i2}$, $t_{i3}$ and $t_{i4}$ are respectively the numbers of men in the $i$th group falling into each of
the four classes defined above the likelihood function will take the form

$$L = \sum_{i=1}^{k}(t_{i1}\log T_{i1} + t_{i2}\log T_{i2} + t_{i3}\log T_{i3} + t_{i4}\log T_{i4}). \tag{23}$$


In general, the expression (23) is a very complicated function of $\sigma$, $\rho$ and of the parameters
contained in $z_1$ and $z_2$, but when there are no errors of measurement (i.e. $\rho = 0$) it may readily
be shown that it reduces to the form (14).

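In practice $\rho^{2}$ may be large compared with $\sigma^{2}$. Using a result proved in Kendall (1943,
p. 355) the integrals (17)-(20) may be expressed as power series of $\tau = \sigma^{2}/(\sigma^{2}+\rho^{2})$ as follows

$$T_{i4} = \sum_{\nu=0}^{\infty}\frac{\tau^{\nu}}{\nu!}\,\Phi^{(\nu)}(U_1)\,\Phi^{(\nu)}(U_2), \tag{24}$$
$$T_{i2} = \Phi(U_2) - T_{i4}, \tag{25}$$
$$T_{i3} = \Phi(U_1) - T_{i4}, \tag{26}$$
$$T_{i1} = 1 - \Phi(U_1) - \Phi(U_2) + T_{i4}, \tag{27}$$

where $U_j = (z_j - Y)/\sqrt{\sigma^{2}+\rho^{2}}$ $(j = 1, 2)$ and $\Phi^{(\nu)}$ denotes the $\nu$th derivative of $\Phi$.
Note that (9) shows that $U_j$ are the values of $f_w$, the equivalent deviation.
If $\sigma^{2}/\rho^{2}$ is so small that $\tau$ may be neglected, we have

$$L = \sum_i[t_{i1}\log(\{1-\Phi(U_1)\}\{1-\Phi(U_2)\}) + t_{i2}\log(\Phi(U_2)\{1-\Phi(U_1)\}) + t_{i3}\log(\Phi(U_1)\{1-\Phi(U_2)\}) + t_{i4}\log\{\Phi(U_1)\Phi(U_2)\}]$$
$$= \sum_i[(t_{i1}+t_{i2})\log\{1-\Phi(U_1)\} + (t_{i1}+t_{i3})\log\{1-\Phi(U_2)\} + (t_{i3}+t_{i4})\log\Phi(U_1) + (t_{i2}+t_{i4})\log\Phi(U_2)], \tag{28}$$

where $U_j = (z_j - Y)/\rho$.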

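If $\sigma^{2}/\rho^{2}$ is not negligible, further terms of the power series (24) must be used, and $L$ must
then be regarded as a function of $\tau$ and of the parameters contained in $f_w$. The maximum
likelihood estimates of these parameters may be obtained in the usual way and the asymptotic
variances and covariances of these estimates take the form

$$[\operatorname{cov}(\hat{u}, \hat{v})] = \left[-E\left(\frac{\partial^{2}L}{\partial u\,\partial v}\right)\right]^{-1} = \left[\sum_{i=1}^{k}\sum_{j=1}^{4}\frac{n_i}{T_{ij}}\left(\frac{\partial T_{ij}}{\partial u}\right)\left(\frac{\partial T_{ij}}{\partial v}\right)\right]^{-1}. \tag{29}$$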
This method of approach may be extended to cover the analysis of the results of three or
more assays on the same subjects, although the computations involved become progressively 
more complex. 


5. A PRACTICAL APPLICATION 


The practical application of the procedures described above may be illustrated by an 
example from the field of research into the causes of pneumoconiosis amongst coal miners. 
As part of the National Coal Board’s Pneumoconiosis Field Research (Fay, 1957) a series of 
radiological surveys is being carried out at a number of collieries at intervals of about 
4 years. At each of the selected collieries a continuous programme of environmental 
measurement (Ashford, 1958b) provides an estimate of known accuracy of the exposure to
airborne dust between successive radiological surveys of each man belonging to the colliery 
population. 

The classification of the X-ray films obtained on the radiological surveys may be regarded 
for the purposes of this analysis as taking the form of a quantal response—on any particular 
survey the man may or may not appear to have pneumoconiosis. The film-reading process is 
subject to errors, which may conveniently be regarded as homoscedastic and normally 
distributed on an underlying subjective abnormality scale (Ashford, 1959b). The development
of the condition (as defined in these terms) is believed to be a non-reversible process.

At the time of the first radiological surveys the majority of the men examined have 
a record of environmental exposure extending back to the time when they joined the mining 
industry. In general there is no specific information about environmental conditions prior 
to the first radiological surveys, but it has been found that an indirect measure of previous 
exposure may be obtained in terms of the period spent in one or more types of general 
environment (e.g. coal face, coal-getting shift). The individual exposures since the time of 
the first radiological surveys may be obtained directly, by reference to the results of the 
systematic programmes of environmental measurement. The man’s ‘dosage’ at the time of 
the first radiological survey is therefore given by the period spent previously in one or more 
classes of general environment and his ‘dosage’ at the time of the second and subsequent 
surveys consists of two distinct components, corresponding to the ‘measured’ exposure since 
the first survey and the previous ‘unmeasured’ exposure. On general grounds it may be 
assumed that there can be no interaction between the ‘measured’ and ‘unmeasured’ com- 
ponents of exposure and that the effect is the same as that of a mixture of two poisons under 
conditions of simple similar action, in the conventional terms of biological assay.

After the second medical survey at a colliery the situation is therefore equivalent to two 
successive assays on the same group of subjects. There are two possible ways in which the
quantal response may be measured. The first of these is to classify simultaneously the pair of 
X-ray films taken of each individual on the two surveys. Because of differences between 
surveys, due to various causes (such as technical quality), the film reader will be aware of 
which film was taken on the first survey and which on the second. He will therefore read 
only ‘no response’ on both surveys, ‘no response’ on the first survey and ‘response’ on the 
second, or ‘response’ on both surveys, and will exclude the possibility of ‘response’ on the 
first survey but ‘no response’ on the second. The situation is therefore similar to that 
considered in § 3, the effect of errors of reading being merely to reduce the slope of the quantal 
response line. Alternatively, the films obtained on successive surveys may be read inde- 
pendently. In these circumstances the reading of ‘response’ on the first survey and ‘no 
response’ on the second is not excluded and the situation is similar to that described in § 4.

For the purposes of illustration only the first (side-by-side) method of classification will be 
considered further. The data obtained at one particular colliery at which this method was 
applied are given in Table 1.


Table 1. Prevalence of pneumoconiosis on two successive surveys 
and corresponding environmental exposure 
7-00 527 3 «(OO 0-6 0 0:2 3 2-2
32-51 178 6 5 4-5 0 0-2 1 1:3 
26-48 1104 8 612 55+147 1) 9 0-4: 57 1569 221) 696 

la 1-1 1 0:6 2 2:3 
11:86 134 7 0 2-7 5 0-9 2 3-4 

0-58 13855 54 0 0:3 2 34 52 50-3 
1629 1367 7 © 3) 3:5) 0 0-8) 4 2-7) 

3-58 1432 12 3| 1-1 0 1-7 9 9-3 
21:98 1452 5 3310 31$ 86 1+ 4 o4' 45 abig 1-5) 19-9 

0-40 2182 5 0| 0-0 1 0-6 4 4:3 

750 402248 C4 0-9) 2 1-0 1 21 

940 3047 5 3 1-5 0 1:5) 2 2-0) 

2:31 3082 8 0| 0-4 1 | 2-5 7 Bl | 
20-51 3250 2 oo} 6 12) 68 of} 4 o44 73 2b15 o5tie 
26-98 3372 4 3 | 2-8 0| 0-5 1 07] 

0-67 4363 6 Oj! 0-0 3 2-4! 3 3-5) 

31:99 4414 2 2 1-5 0 0:3 0 0:3) 

867 4597 3 1 0:8 2 3 0 | 
1950 4665 2 1 il 1 Pe 0-4 
200 «64781 1 CO toa oof! — ges 2) gt loa 88 

188 5115 6 0 0-2 4 3-1 2 2-7 
13-84 5940 13 6 57 4 4-9 3 2-4 

188 6065 45 1 15 30 27:0) 14 16-5 
33:96 6079 4 38 3-0 1 0:6 0 0-4 

7:28 #6178 #lls 5 251... 2 5-8 4 2-7 
eis 6178 5 2! gap '*? 4795 497360 9722 © ggs 200 
20-42 «6249 «5s 2-9 1 1-4 1 0-7 
41-02 6249 1 0 0-8 0 0-1 1 0-1 
$x_{i1}$ = period (years) on the coal face, coal-getting shift, prior to first survey, weighted means.
$x_{i2}$ = dust exposure between first and second surveys, weighted means.
$n_i$ = total number of men in group.
$s_{i1}$ = number of men with pneumoconiosis on first survey.
$s_{i2}$ = number of men developing pneumoconiosis between first and second surveys.
$s_{i3}$ = number of men without pneumoconiosis on both surveys.


At this colliery the ‘unmeasured’ environmental exposure prior to the first survey is expressed in terms of the period spent on the coal face, coal-getting shift (denoted $x_1$) and the ‘measured’ exposure between the first and second surveys is expressed in terms of arbitrary units of dust exposure (denoted $x_2$).
Estimates of the five parameters $a_1$, $b_1$, $a_2$, $b_2$ and $\theta$ associated with the response of the population to the ‘unmeasured’ and ‘measured’ components of exposure were obtained by the solution of equations (15), by the Newton-Raphson procedure. The tolerance distribution was assumed to be logistic in form and the values of $P_{ij}$ and its partial derivatives were found using the expression,

$$\frac{\partial P_{ij}}{\partial \omega} = \left(\frac{dP_{ij}}{df_{ij}}\right)\left(\frac{\partial f_{ij}}{\partial \omega}\right) = P_{ij}(1 - P_{ij})\left(\frac{\partial f_{ij}}{\partial \omega}\right), \tag{30}$$
where the $\omega$’s are the parameters and the $f$’s are given by equations (2) and (3).


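The initial approximations for $\hat a_1$, $\hat b_1$, $\hat a_2$ and $\hat b_2$ were obtained by reference to the marginal distributions of $x_1$ and $x_2$. The final values of these estimates were as follows

$\hat a_1 = -4.32$, $\hat b_1 = 3.57$, $\hat a_2 = -10.20$, $\hat b_2 = 5.53$ and $\hat\theta = 5.22$.

The variances of these estimates, obtained by the application of equation (16), are as follows

$\operatorname{var}\hat a_1 = 0.361$, $\operatorname{var}\hat b_1 = 0.276$, $\operatorname{var}\hat a_2 = 5.975$, $\operatorname{var}\hat b_2 = 1.498$, $\operatorname{var}\hat\theta = 1.766$.

The covariances of corresponding values of $\hat a_j$ and $\hat b_j$ are

$\operatorname{cov}(\hat a_1, \hat b_1) = -0.298$ and $\operatorname{cov}(\hat a_2, \hat b_2) = -2.928$.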


The hazard relating to either of the two components of exposure may conveniently be expressed in terms of the ED 50, the dose at which 50 % of the population may be expected to manifest the response if the men concerned were exposed only to the given component. If this quantity is denoted $d$ and $D = \log d$, we have, from (2)

$$D = \log d = -a/b.$$

From Fieller’s theorem (Finney, 1952, p. 29) the 95 % fiducial limits for the ED 50’s are found to be

$D_1 = 1.210 \pm 0.1175$, i.e. $d_1 = 16$ (21 years, 12 years);
and $D_2 = 1.844 \pm 0.1880$, i.e. $d_2 = 70$ (108 years, 45 years).
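
The arithmetic behind these limits can be sketched as follows. This is not the exact Fieller calculation: it uses the usual large-sample approximation $D \pm t\,\mathrm{s.e.}(D)$, in which Fieller's correction term is neglected. The multiplier $t = 2$ and the use of logarithms to base 10 are assumptions (the paper states neither), and the function name `ed50_limits` is hypothetical.

```python
import math

def ed50_limits(a, b, var_a, var_b, cov_ab, t=2.0):
    """Return D = -a/b and the half-width t * s.e.(D).

    The standard error uses the first-order (delta-method) variance of a ratio:
    var(D) ~ {var(a) + 2 D cov(a, b) + D^2 var(b)} / b^2.
    """
    D = -a / b
    var_D = (var_a + 2.0 * D * cov_ab + D * D * var_b) / (b * b)
    return D, t * math.sqrt(var_D)

# Estimates, variances and covariances as quoted in the text.
D1, half1 = ed50_limits(-4.32, 3.57, 0.361, 0.276, -0.298)
D2, half2 = ed50_limits(-10.20, 5.53, 5.975, 1.498, -2.928)

for name, D, half in (("unmeasured", D1, half1), ("measured", D2, half2)):
    # Back-transform from the (assumed) base-10 log scale to the dose scale.
    print(name, round(D, 3), round(half, 4),
          round(10 ** D), round(10 ** (D + half)), round(10 ** (D - half)))
```

Under these assumptions the half-widths come out close to the quoted 0.1175 and 0.1880, and the back-transformed values agree with the quoted doses after rounding.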


The fit of the observations to the hypothesis on which the analysis is based may be 
examined in Table 1, which gives details of both the ‘observed’ and ‘expected’ numbers of 
men showing evidence of pneumoconiosis. The agreement between the two sets of figures is 
very good and there is no evidence of any systematic difference between them. The value of 
$\chi^2$, after grouping certain of the entries for which the expected numbers are small, is 5.777,
which for 7 D.F. is not significant ($p \approx 58$ %).
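
The significance level can be checked with a short calculation. The sketch below uses the standard finite-series expressions for the upper tail of the $\chi^2$ distribution with an integer number of degrees of freedom; the function name `chi2_upper_tail` is an assumption made for the example.

```python
import math

def chi2_upper_tail(x, df):
    """P(chi-square with df degrees of freedom exceeds x), for integer df >= 1."""
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{r=0}^{df/2-1} (x/2)^r / r!
        term, total = 1.0, 1.0
        for r in range(1, df // 2):
            term *= (x / 2.0) / r
            total += term
        return math.exp(-x / 2.0) * total
    # Odd df: 2 Q(chi) + 2 phi(chi) * sum_{r=1}^{(df-1)/2} chi^(2r-1) / (1*3*...*(2r-1))
    chi = math.sqrt(x)
    normal_tail = 0.5 * math.erfc(chi / math.sqrt(2.0))
    density = math.exp(-x / 2.0) / math.sqrt(2.0 * math.pi)
    term, total = chi, 0.0
    for r in range(1, (df - 1) // 2 + 1):
        if r > 1:
            term *= x / (2 * r - 1)
        total += term
    return 2.0 * normal_tail + 2.0 * density * total

p = chi2_upper_tail(5.777, 7)
print(f"P(chi2_7 > 5.777) = {p:.3f}")
```

The result is a little under 0.57, which agrees with the quoted figure to within about one percentage point.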

Similar data are being obtained at other collieries, both by means of independent and 
side-by-side readings. The calculations involved are extremely complex and suitable elec- 
tronic computer programmes are in course of preparation. 
Some consequences of superimposed error in time
series analysis


By A. M. WALKER 
Statistical Laboratory, University of Cambridge 


1. INTRODUCTION 


It is well known that time series analysis usually becomes much more complicated when 
it is necessary to take account of superimposed observational error. However, although it 
is now many years since attention was first drawn to the problem of superimposed error 
(this has, for example, been discussed by Kendall (1944) and Quenouille (1947) with par- 
ticular reference to series generated by a second-order autoregressive process), a detailed 
study of the complications which can occur does not yet seem to have been made. In this 
paper some of the problems that might be considered in such a study are formulated, and 
the difficulties that arise in attempting to solve these are discussed. Unfortunately it would 
appear that little progress is possible without very laborious algebraic manipulations— 
this, together with the fact that the mathematics involved is rather uninteresting, may 
account for the neglect of the topic by writers on the theory of time series analysis. Never- 
theless, a few specific results are obtained, and it is hoped that these may stimulate further 
investigations which should at any rate be of practical importance even if unrewarding in 
their theoretical aspects. 

It will be assumed, as is customary, that the errors are additive and form an independent stationary series which is independent of the original time series. Thus if $x_t$ denotes the observation at time $t$

$$x_t = u_t + \eta_t \tag{1}$$

where $(\eta_t)$ is a completely random series, i.e. a set of independent random variables with a common distribution (whose mean may be taken as zero without loss of generality) and is independent of the stationary series $(u_t)$. It will also be assumed that the values of $t$ are equally spaced, so that they may be taken to be integers without loss of generality, and that $(u_t)$ is generated by a linear autoregressive process, i.e. is the stationary solution of an equation of the form

$$u_t + a_1 u_{t-1} + \dots + a_p u_{t-p} = v_t \tag{2}$$

where $(v_t)$ is an independent stationary series, and the constants $a_j$ are such that the moduli of the roots of the equation $z^p + a_1 z^{p-1} + \dots + a_p = 0$ are less than unity. The condition on $(u_t)$ is less restrictive than might appear at first sight, since by taking $p$ to be sufficiently large one can obtain an adequate approximation to most linear processes, which are such that $u_t = \sum_{r=0}^{\infty} g_r v_{t-r}$ where $\sum_{r=0}^{\infty} g_r^2$ is finite (this is certainly true if $G(z) = \sum_{r=0}^{\infty} g_r z^r$ has radius of convergence greater than unity and has no zeros inside or on the unit circle, so that $[G(z)]^{-1}$ is analytic with a power series expansion for $|z| \le 1$).

Finally, the data will be assumed to consist of values of $x$ for $n$ consecutive values of $t$, which we denote by $x_1, x_2, \dots, x_n$.

2. ESTIMATION OF PARAMETERS
From (1) and (2)

$$x_t + a_1 x_{t-1} + \dots + a_p x_{t-p} = \eta_t + a_1 \eta_{t-1} + \dots + a_p \eta_{t-p} + v_t = y_t, \text{ say}, \tag{3}$$

so that $(x_t)$ is generated by a linear autoregressive process with residuals $y_t$ which form a $p$-dependent stationary process, $y_t$ being independent of $y_{t-r}$ only for values of $r > p$. The correlation between $y_t$ and $y_{t-r}$ $(r \le p)$ is a function of all the parameters $a_1, a_2, \dots, a_p$, as well as of the ratio $\lambda = \sigma'^2/\sigma^2$, where $\sigma'^2 = \operatorname{var}\eta_t$, $\sigma^2 = \operatorname{var} v_t$; it is therefore not surprising that the problem of estimating $a_1, a_2, \dots, a_p$ should be much less straightforward than when there is no superimposed error.

Suppose now that the processes $(u_t)$, $(\eta_t)$ and therefore $(x_t)$ are normal. The behaviour of $(x_t)$ is then completely determined by its autocovariance function $\gamma_x(s) = E(x_t x_{t+s})$ [we may take $E(x_t)$ to be zero without loss of generality], or its spectral density

$$f_x(\omega) = \frac{1}{2\pi}\sum_{s=-\infty}^{\infty} e^{i\omega s}\gamma_x(s).$$

It is easily seen that

$$2\pi f_x(\omega) = \sigma'^2 + \sigma^2\{(1 + a_1 e^{i\omega} + \dots + a_p e^{ip\omega})(1 + a_1 e^{-i\omega} + \dots + a_p e^{-ip\omega})\}^{-1} \tag{4}$$

and that $\gamma_x(s) = \gamma_u(s)$ $(s > 0)$, $\gamma_x(0) = \gamma_u(0) + \sigma'^2$, so that the effect of the superimposed error is to reduce all the autocorrelations of the series in the same ratio, multiplying them by the factor $\{1 + (\sigma'^2/\sigma_u^2)\}^{-1}$, where $\sigma_u^2 = \operatorname{var} u_t$.
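
This attenuation can be illustrated by simulation. The sketch below generates a first-order process $u_t + a u_{t-1} = v_t$ with added error and compares the sample serial correlations with the attenuated theoretical values; for $p = 1$ the factor is $1/\{1 + \lambda(1 - a^2)\}$. All numerical values and function names are assumptions chosen for the example, and the serial covariances are mean-corrected.

```python
import random

def simulate_series(a, sigma, sigma_err, n, seed=1, burn_in=500):
    """Simulate x_t = u_t + eta_t, where u_t + a*u_{t-1} = v_t."""
    rng = random.Random(seed)
    u, xs = 0.0, []
    for t in range(n + burn_in):
        u = -a * u + rng.gauss(0.0, sigma)              # autoregressive step
        if t >= burn_in:
            xs.append(u + rng.gauss(0.0, sigma_err))    # add superimposed error
    return xs

def serial_covariance(xs, s):
    """Mean-corrected sample serial covariance at lag s, divisor n - s."""
    n = len(xs)
    mean = sum(xs) / n
    return sum((xs[t] - mean) * (xs[t + s] - mean) for t in range(n - s)) / (n - s)

a, sigma, sigma_err = -0.6, 1.0, 1.0        # hypothetical parameter values
lam = sigma_err**2 / sigma**2
factor = 1.0 / (1.0 + lam * (1.0 - a * a))  # attenuation of every autocorrelation

xs = simulate_series(a, sigma, sigma_err, n=50000)
c0, c1, c2 = (serial_covariance(xs, s) for s in (0, 1, 2))
r1, r2 = c1 / c0, c2 / c0
a_hat = -c2 / c1                            # ratio of lag-2 to lag-1 covariance

print(f"r1 = {r1:.3f} (theory {(-a) * factor:.3f})")
print(f"r2 = {r2:.3f} (theory {a * a * factor:.3f})")
print(f"-C2/C1 = {a_hat:.3f} (true a = {a})")
```

Because both serial correlations are reduced by the same factor, their ratio still recovers $a$, whereas $-r_1$ alone would underestimate $|a|$.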


The most obvious way in which to obtain estimates of the complete set of parameters occurring in the autocovariance function, i.e. $a_1, a_2, \dots, a_p$, $\sigma$ (or $\sigma^2$) and $\sigma'^2$, would be to equate the first $p+2$ sample serial covariances $C_s = \sum_{t=1}^{n-s} x_t x_{t+s}/(n-s)$ $(s = 0, 1, \dots, p+1)$ to their expectations $\gamma_x(s)$. [If $E(x_t)$ is not known a priori one may modify these in the usual way, replacing $x_t$ by $x_t - \bar{x}$, where $\bar{x}$ is the sample mean.] This would give estimation equations

$$\frac{C_i}{C_1} = \frac{\hat\rho_u(i)}{\hat\rho_u(1)} \quad (i = 2, 3, \dots, p+1), \qquad C_1 = \hat\sigma_u^2\,\hat\rho_u(1), \qquad C_0 = \hat\sigma_u^2 + \hat\sigma'^2, \tag{5}$$

where, if $\rho_u(i) = f_i(a_1, a_2, \dots, a_p)$ is the $i$th autocorrelation of the series $(u_t)$ expressed as a function of $a_1, a_2, \dots, a_p$ (which will be the ratio of two polynomials), $\hat\rho_u(i) = f_i(\hat a_1, \hat a_2, \dots, \hat a_p)$.

The estimates obtained from (5) will certainly be consistent and have asymptotic standard errors which are $O(n^{-1/2})$ $(n \to \infty)$. In fact the limiting joint distribution of

$$n^{1/2}(\hat a_1 - a_1),\ n^{1/2}(\hat a_2 - a_2),\ \dots,\ n^{1/2}(\hat a_p - a_p),\ n^{1/2}(\hat\sigma^2 - \sigma^2),\ n^{1/2}(\hat\sigma'^2 - \sigma'^2)$$

is multivariate normal with zero means and finite covariance matrix; this follows from the asymptotic normality of $(C_0, C_1, \dots, C_{p+1})$ which is easily established (see, for example, Walker, 1954).

There is, however, the disadvantage that the method involves the solution of an algebraic equation which appears to be in general very complicated. Thus $\hat a_1, \hat a_2, \dots, \hat a_p$ are obtained from the first $p$ equations of (5) and since

$$\hat\rho_u(i) + \hat a_1\hat\rho_u(i-1) + \dots + \hat a_p\hat\rho_u(i-p) = 0 \quad (i = 2, 3, \dots, p+1),$$

these are equivalent to the set

$$C_{p+1} + \hat a_1 C_p + \dots + \hat a_p C_1 = 0,$$
$$C_i + \hat a_1 C_{i-1} + \dots + (\hat a_i C_1/\hat\rho_u(1)) + \dots + \hat a_p C_{i-p} = 0 \quad (i = 2, 3, \dots, p), \tag{6}$$

(where for $s < 0$, $C_s$ is defined as $C_{-s}$), which determine $\hat a_1, \hat a_2, \dots, \hat a_p$ as ratios of two polynomials in $[\hat\rho_u(1)]^{-1}$. Substitution of these polynomials in $f_1(\hat a_1, \hat a_2, \dots, \hat a_p)$ will then give an equation for $\hat\rho_u(1)$. Of course (6) suggests a simple way of proceeding by successive approximation, $\hat a_1, \hat a_2, \dots, \hat a_p$ being calculated using the value of $\hat\rho_u(1)$ determined by the previous approximation, but the complicated form of the equation makes it very difficult, for example, to evaluate the asymptotic variances of the estimates.

Other simpler methods which make use of some serial covariances for lags greater than $p+1$ have therefore been suggested. For example, in the case of a second-order autoregressive process, $\hat a_1$ and $\hat a_2$ could be obtained from

$$C_3 + \hat a_1 C_2 + \hat a_2 C_1 = 0, \qquad C_4 + \hat a_1 C_3 + \hat a_2 C_2 = 0,$$

as pointed out by Quenouille (1947, p. 127).

Again for general $p$, since

$$\sum_{i=0}^{p} a_i \rho_u(r-i) = 0 \quad (r > p)$$

(defining $a_0$ as unity), we can use any $p$ equations of the form

$$\sum_{i=0}^{p} \hat a_i C_{r-i} = 0 \tag{7}$$

(see, for example, Bartlett, 1955, p. 265). As $E(C_s) \to 0$ when $s \to \infty$ one would in general expect to obtain more accurate estimates from $r = p+1, p+2, \dots, 2p$ than from other sets of $p$ values of $r$.

Having thus determined $\hat a_1, \hat a_2, \dots, \hat a_p$ we obtain $\hat\sigma_u^2$ and $\hat\sigma'^2$ as before from the last two equations of (5).
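
The reason for using only lags greater than $p$ can be seen from a small numerical sketch for $p = 2$. It uses theoretical autocovariances in place of sample values, so that sampling error plays no part, and compares equations of the form (7) with $r = 3, 4$ against the ordinary Yule-Walker equations, which involve $C_0$ and therefore the error variance. The parameter values and function names are assumptions made for the example.

```python
def ar2_plus_noise_autocov(a1, a2, sigma2, sigma2_err, max_lag):
    """Theoretical gamma_x(0..max_lag) for x_t = u_t + eta_t,
    where u_t + a1*u_{t-1} + a2*u_{t-2} = v_t."""
    rho = [1.0, -a1 / (1.0 + a2)]
    for s in range(2, max_lag + 1):
        rho.append(-a1 * rho[s - 1] - a2 * rho[s - 2])
    var_u = sigma2 * (1.0 + a2) / ((1.0 - a2) * ((1.0 + a2) ** 2 - a1 ** 2))
    gamma = [var_u * r for r in rho]
    gamma[0] += sigma2_err          # the error affects lag 0 only
    return gamma

def solve2(m11, m12, m21, m22, r1, r2):
    """Solve a 2x2 linear system by Cramer's rule."""
    det = m11 * m22 - m12 * m21
    return (r1 * m22 - m12 * r2) / det, (m11 * r2 - m21 * r1) / det

a1, a2 = -1.0, 0.5                  # hypothetical autoregressive constants
C = ar2_plus_noise_autocov(a1, a2, sigma2=1.0, sigma2_err=1.0, max_lag=4)

# Equations of the form (7) with r = 3, 4 (lags greater than p only).
b1, b2 = solve2(C[2], C[1], C[3], C[2], -C[3], -C[4])
# Ordinary Yule-Walker equations (r = 1, 2), which ignore the superimposed error.
yw1, yw2 = solve2(C[0], C[1], C[1], C[0], -C[1], -C[2])

print("lags 3, 4   :", round(b1, 4), round(b2, 4))
print("Yule-Walker :", round(yw1, 4), round(yw2, 4))
```

With exact autocovariances the equations for $r = 3, 4$ return the true constants, while the Yule-Walker solution is pulled towards zero by the inflated $C_0$.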

This brings us to the first main problem, the relative and absolute asymptotic efficiencies
of the various methods of estimation. Let the first method, using serial covariances
$C_0, C_1, \dots, C_{p+1}$, be referred to as Method A, and the second method, using equations of the
form (7), as Method B. Then the following questions may be asked:

(i) What is the best choice of values of r in Method B, judging this by the asymptotic 
efficiency of the estimates ? 

(ii) How does the efficiency† of Method B using these values (or any other convenient
set such as $r = p+m, p+m+1, \dots, 2p+m-1$ $(m > 0)$) compare with that of Method A?

(iii) What can be said about the absolute efficiencies of Methods A and B? 

To these we may add: 

(iv) How far are the answers to (i) and (ii) affected by non-normality of $(u_t)$ or $(\eta_t)$?

(for (iii) normality of $(u_t)$ and $(\eta_t)$ is clearly vital).

The importance of the normality assumption, which of course in practice will often not
be realistic, may be regarded as the second main problem. One could of course consider
other methods of estimation, such as those using all the information contained in the serial
covariances $C_0, C_1, \dots, C_k$, where $k$ is an arbitrary integer exceeding $p + 1$, but in view of the
difficulties that arise in dealing adequately with Methods A and B this would hardly seem
to be worth while.


† Throughout this paper the term ‘efficiency’ will always mean ‘asymptotic efficiency’.


3. EFFICIENCY OF METHODS OF ESTIMATION

All these sets of estimates are easily seen to have similar asymptotic distributional properties: thus $n^{1/2}(\hat a_1 - a_1)$, $n^{1/2}(\hat a_2 - a_2)$, etc., have in every case a limiting multinormal distribution with zero means. Hence to investigate their efficiencies it is merely necessary to calculate their asymptotic variances (and possibly covariances if some overall measure of efficiency for the whole set is required). In principle this is quite straightforward since one has merely to express the estimates asymptotically as linear functions of the $C_i$ or the sample serial correlations $r_i = C_i/C_0$ and then use the formulae for asymptotic variances and covariances of the $C_i$ or $r_i$ (see, for example, Bartlett, 1955, pp. 255-6). However, the calculations seem to be very tedious.

A fairly detailed examination of the case p = 1 was made. Here the situation is par- 
ticularly simple, since Method A then becomes identical to Method B with r = 1,2, and 
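(6) reduces to the single equation

$$C_2 + C_1\hat a = 0 \tag{8}$$

(for convenience we replace $a_1$ by $a$ in this case). The asymptotic variance of $\hat a$ is probably most easily obtained by using the general result

$$\operatorname{cov}(H C_t, H C_u) \sim \frac{1}{n}\left[\text{coefficient of } z^{t-u} \text{ in } G_x^2(z)\Big(\sum_{i=0}^{p} a_i z^{i}\Big)\Big(\sum_{i=0}^{p} a_i z^{-i}\Big)\right] \quad (t \ge u > p), \tag{9}$$

where $H = \sum_{i=0}^{p} a_i E^{-i}$, $E$ being the forward shift operator that changes $t$ into $t + 1$, and $G_x(z)$ the autocovariance generating function $\sum_{s=-\infty}^{\infty} \gamma_x(s) z^s$ of $(x_t)$ (see, for example, Bartlett & Diananda, 1950, p. 112). With $t = u = 2$ we get

$$n\operatorname{var}(C_2 + aC_1) \sim \text{constant term in } G_x^2(z)(1 + az)(1 + az^{-1}), \tag{10}$$

and from (8) $\delta(C_2 + C_1 a) \sim -\gamma_x(1)\,\delta\hat a$ (using the standard notation for statistical differentials), so that

$$\operatorname{var}\hat a \sim \operatorname{var}(C_2 + C_1 a)/\gamma_x^2(1).$$

Now

$$\gamma_x(1) = -\frac{a\sigma^2}{1 - a^2} \quad\text{and}\quad G_x(z) = \sigma^2\left\{\lambda + \frac{1}{(1 + az)(1 + az^{-1})}\right\} = \sigma^2\left[\lambda + \frac{1}{1 - a^2}\left\{\frac{1}{1 + az} - \frac{az^{-1}}{1 + az^{-1}}\right\}\right].$$

Hence, after a little straightforward algebra we obtain

$$n\operatorname{var}\hat a \sim \frac{1 - a^2}{a^2}\{1 + 2\lambda(1 - a^2) + \lambda^2(1 - a^4)\}. \tag{11}$$

In general, Method B will give $C_r + \hat a C_{r-1} = 0$ (where $r \ge 2$). It can be shown similarly that then

$$n\operatorname{var}\hat a \sim \frac{1 - a^2}{a^{2(r-1)}}\{1 + 2\lambda(1 - a^2) + \lambda^2(1 - a^4)\}.$$

Hence we may conclude that it is best to take $r = 2$.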
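
The dependence on $r$ is easily tabulated. The following sketch simply evaluates the asymptotic expression above; the function name `n_var_a_hat` and the values of $a$ and $\lambda$ are assumptions chosen for illustration.

```python
def n_var_a_hat(a, lam, r=2):
    """Asymptotic value of n * var(a_hat) when a_hat solves C_r + a_hat * C_{r-1} = 0."""
    bracket = 1.0 + 2.0 * lam * (1.0 - a * a) + lam * lam * (1.0 - a ** 4)
    return (1.0 - a * a) / a ** (2 * (r - 1)) * bracket

a, lam = 0.5, 1.0                       # hypothetical values
variances = [n_var_a_hat(a, lam, r) for r in (2, 3, 4)]
for r, v in zip((2, 3, 4), variances):
    print(f"r = {r}: n var(a_hat) ~ {v:.4f}")
```

Each increase of $r$ by one multiplies the asymptotic variance by $1/a^2 > 1$, which is why $r = 2$ is the best choice.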
To obtain the absolute asymptotic efficiency of $\hat a$ we use a result due to Whittle (see, for example, Appendix 2 to A Study in the Analysis of Stationary Time Series, by H. Wold, 1954, p. 214), which may be conveniently stated for general $p$. Asymptotically efficient estimates of $a_1, a_2, \dots, a_p$ and $\lambda$ are given by the minimization of

$$S = n \sum_{s=-\phi(n)}^{\phi(n)} \alpha_s C_s,$$

where $\alpha_s$ is the coefficient of $e^{i\omega s}$ in the Fourier expansion of $v\{2\pi f_x(\omega)\}^{-1} = \{g(\omega)\}^{-1}$ say. $v$, the ‘residual prediction variance’, is given by the exponential of the constant term in the Laurent expansion of $\log 2\pi f_x(\omega)$ in positive and negative powers of $e^{i\omega}$, i.e.

$$\exp\left\{\frac{1}{2\pi}\int_{-\pi}^{\pi} \log 2\pi f_x(\omega)\,d\omega\right\}, \quad\text{and}\quad \phi(n) = o(n) \ (n \to \infty).$$

Thus if

$$1 + \lambda(1 + a_1 e^{i\omega} + \dots + a_p e^{ip\omega})(1 + a_1 e^{-i\omega} + \dots + a_p e^{-ip\omega}) = (b_0 + b_1 e^{i\omega} + \dots + b_p e^{ip\omega})(b_0 + b_1 e^{-i\omega} + \dots + b_p e^{-ip\omega}),$$

where $b_0, b_1, \dots, b_p$ are such that the roots of the equation $b_0 + b_1 z + \dots + b_p z^p = 0$ have moduli greater than unity, then $v = b_0^2\sigma^2$ and

$$\{g(\omega)\}^{-1} = \frac{b_0^2(1 + a_1 e^{i\omega} + \dots + a_p e^{ip\omega})(1 + a_1 e^{-i\omega} + \dots + a_p e^{-ip\omega})}{(b_0 + b_1 e^{i\omega} + \dots + b_p e^{ip\omega})(b_0 + b_1 e^{-i\omega} + \dots + b_p e^{-ip\omega})},$$

the $\alpha_s$ being determined by putting $\{g(\omega)\}^{-1}$ into partial fractions and expanding these. The asymptotic covariance matrix of the estimates $(a_1^*, a_2^*, \dots, a_p^*, \lambda^*) = (\theta_1, \theta_2, \dots, \theta_{p+1})$ say, is then $V^{-1}$, where $V = (v_{ij})$ with

$$v_{ij} = \tfrac{1}{2}n\left\{\text{constant term in expansion of } \frac{\partial \log g}{\partial\theta_i}\,\frac{\partial \log g}{\partial\theta_j}\right\}.$$

Taking $p = 1$ we find that the asymptotic variance of the efficient estimate $a^*$ is given by

$$n\operatorname{var} a^* \sim \frac{(1 - a^2)(1 + a\mu_1)^2}{(a + \mu_1)^2}, \tag{12}$$

where $\mu_1$ is the root of the equation

$$\lambda a(z^2 + 1) + z\{1 + \lambda(1 + a^2)\} = 0$$

whose modulus is less than unity. The algebra involved in deriving (12) is somewhat heavy, owing to the fact that $\mu_1$ cannot be expressed simply in terms of $a$ and $\lambda$. Details of this are given in the Appendix.

The asymptotic efficiency of $\hat a = -C_2/C_1$ is therefore

$$E_a = \frac{a^2(1 + a\mu_1)^2}{(a + \mu_1)^2[1 + 2\lambda(1 - a^2) + \lambda^2(1 - a^4)]}. \tag{13}$$

Table 1 gives values of $E_a$ (which is an even function of $a$) for various values of $|a|$ and $\lambda$. From this we see that the efficiency is near unity only when $\lambda$ or $a$ is small. It is in fact easy to show that $E_a \to 1$ as $\lambda \to 0$ for fixed $a \ne 0$ or as $a \to 0$ for fixed $\lambda \ne 0$. For large $\lambda$, $E_a$ is approximately $(1 - a^2)^3/(1 + a^2)$.

Table 1. Values of $E_a(a, \lambda)$

                    |a|
   λ      0.1    0.3    0.5    0.7    0.9
  0.5    1.00   0.96   0.91   0.87   0.90
  1.0    0.99   0.92   0.80   0.71   0.74
  2.0    0.98   0.86   0.66   0.49   0.47
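
The entries of Table 1 can be recomputed from the efficiency formula, i.e. the ratio of the asymptotic variance of the efficient estimate to that of $\hat a = -C_2/C_1$. The sketch below is a straightforward evaluation of that ratio for $p = 1$; the function names `mu1` and `efficiency_a` are assumptions, and the code requires $a \ne 0$ and $\lambda > 0$.

```python
import math

def mu1(a, lam):
    """Root of lam*a*(z^2 + 1) + z*{1 + lam*(1 + a^2)} = 0 with modulus below one."""
    B = (1.0 + lam * (1.0 + a * a)) / (lam * a)   # equation is z^2 + B z + 1 = 0
    # The two roots are reciprocals; this choice of sign picks the smaller one.
    return (-B + math.copysign(math.sqrt(B * B - 4.0), B)) / 2.0

def efficiency_a(a, lam):
    """Asymptotic efficiency of a_hat = -C2/C1 relative to the efficient estimate."""
    m = mu1(a, lam)
    bracket = 1.0 + 2.0 * lam * (1.0 - a * a) + lam * lam * (1.0 - a ** 4)
    return a * a * (1.0 + a * m) ** 2 / ((a + m) ** 2 * bracket)

for lam in (0.5, 1.0, 2.0):
    row = "  ".join(f"{efficiency_a(a, lam):.2f}" for a in (0.1, 0.3, 0.5, 0.7, 0.9))
    print(f"lambda = {lam:4.1f}:  {row}")
```

Entries checked by hand, for example $\lambda = 1$, $|a| = 0.5$, agree with the tabulated values to two decimal places.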
There remains the question of the efficiency of the estimation of the other parameters by Method A. In dealing with this it is convenient to consider $\lambda$ and $\sigma_x^2 = \sigma_u^2 + \sigma'^2$ rather than $\sigma^2$ and $\sigma'^2$. Here the calculations are even heavier, but it was thought worth while to obtain an explicit formula for the asymptotic efficiency of $\hat\lambda$ (it seems likely that an interval estimate would be required for $\lambda$ more often than for $\sigma_x^2$, for example, because it is desired to test the hypothesis $\lambda = 0$, i.e. that no superimposed error is present).

The last two equations of (5) give

$$r_1 = -\hat a/\{1 + \hat\lambda(1 - \hat a^2)\}.$$

Hence

$$\hat\lambda = (r_2 - r_1^2)/(r_1^2 - r_2^2)$$

and

$$\delta\hat\lambda \sim \frac{1 + \lambda(1 - a^2)}{a^2(1 - a^2)}\,[(\delta r_2 + a\,\delta r_1)\{1 + \lambda(1 + a^2)\} + a\,\delta r_1\{1 + \lambda(1 - a^2)\}]. \tag{14}$$

Now it is easily shown that

$$n\sigma_x^4 \operatorname{var} r_1 \sim (1 + 2\rho_1^2)v_0 - 4\rho_1 v_1 + v_2,$$
$$n\sigma_x^4 \operatorname{cov}(r_1, r_2 + ar_1) \sim av_0 + (1 - 2a\rho_1 - a^2)v_1 - (a + 2\rho_1)v_2$$
and
$$n\sigma_x^4 \operatorname{var}(r_2 + ar_1) \sim (1 + a^2)v_0 + 2av_1,$$

(compare equation (10)), where $\rho_1 = \rho_x(1) = -a/\{1 + \lambda(1 - a^2)\}$ and $v_s$ is the coefficient of $z^s$ in $G_x^2(z)$. Thus

$$\sigma^4 a^4 (n\operatorname{var}\hat\lambda) \sim A_0 v_0 + A_1 v_1 + A_2 v_2, \tag{15}$$

where

$$A_0 = (1 + a^2)\{1 + \lambda(1 + a^2)\}^2 + 2a^2\{1 + \lambda(1 + a^2)\}\{1 + \lambda(1 - a^2)\} + (1 + 2\rho_1^2)a^2\{1 + \lambda(1 - a^2)\}^2,$$
$$A_1 = 2a\{1 + \lambda(1 + a^2)\}^2 + 2a(1 - 2a\rho_1 - a^2)\{1 + \lambda(1 + a^2)\}\{1 + \lambda(1 - a^2)\} - 4\rho_1 a^2\{1 + \lambda(1 - a^2)\}^2,$$
$$A_2 = -2a(a + 2\rho_1)\{1 + \lambda(1 + a^2)\}\{1 + \lambda(1 - a^2)\} + a^2\{1 + \lambda(1 - a^2)\}^2.$$

Now $\sigma_x^2 = \sigma^2\{1 + \lambda(1 - a^2)\}/(1 - a^2)$ and

$$v_0 = \frac{\sigma^4}{(1 - a^2)^3}\{\lambda^2(1 - a^2)^3 + 2\lambda(1 - a^2)^2 + 1 + a^2\},$$
$$v_s = \frac{\sigma^4(-a)^s}{(1 - a^2)^3}\{2\lambda(1 - a^2)^2 + s + 1 - (s - 1)a^2\} \quad (s > 0).$$

Hence, on expressing the right-hand side of (15) as a quartic in $\lambda$, we find

$$a^4(n\operatorname{var}\hat\lambda) \sim 1 + 4\lambda(1 + a^2) + \lambda^2(6 + 14a^2 + a^4) + 2\lambda^3(2 + 8a^2 + a^4 - a^6) + \lambda^4(1 + 6a^2 + a^4). \tag{16}$$
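
The reduction of (15) to the quartic (16) can be checked numerically. The sketch below evaluates $a^4(n\operatorname{var}\hat\lambda)$ both ways, taking $\sigma = 1$ and using the expressions for $A_0$, $A_1$, $A_2$, $v_0$ and $v_s$ as reconstructed above; the function names and the test values of $a$ and $\lambda$ are assumptions.

```python
def a4_n_var_lambda_via_15(a, lam):
    """a^4 * n * var(lambda_hat) from A0*v0 + A1*v1 + A2*v2, with sigma = 1."""
    K = 1.0 + lam * (1.0 + a * a)
    k = 1.0 + lam * (1.0 - a * a)
    rho1 = -a / k
    q = 1.0 - a * a

    def v(s):
        # Coefficient of z^s in the squared autocovariance generating function.
        if s == 0:
            return (lam * lam * q ** 3 + 2.0 * lam * q ** 2 + 1.0 + a * a) / q ** 3
        return (-a) ** s * (2.0 * lam * q ** 2 + s + 1.0 - (s - 1.0) * a * a) / q ** 3

    A0 = (1.0 + a * a) * K * K + 2.0 * a * a * K * k + (1.0 + 2.0 * rho1 ** 2) * a * a * k * k
    A1 = 2.0 * a * K * K + 2.0 * a * (1.0 - 2.0 * a * rho1 - a * a) * K * k - 4.0 * rho1 * a * a * k * k
    A2 = -2.0 * a * (a + 2.0 * rho1) * K * k + a * a * k * k
    return A0 * v(0) + A1 * v(1) + A2 * v(2)

def a4_n_var_lambda_via_16(a, lam):
    """The same quantity from the quartic in lambda."""
    a2, a4, a6 = a ** 2, a ** 4, a ** 6
    return (1.0 + 4.0 * lam * (1.0 + a2) + lam ** 2 * (6.0 + 14.0 * a2 + a4)
            + 2.0 * lam ** 3 * (2.0 + 8.0 * a2 + a4 - a6) + lam ** 4 * (1.0 + 6.0 * a2 + a4))

for a, lam in ((0.5, 1.0), (0.5, 2.0), (0.5 ** 0.5, 1.0)):
    print(a, lam, a4_n_var_lambda_via_15(a, lam), a4_n_var_lambda_via_16(a, lam))
```

At $a = 0.5$, $\lambda = 1$ both routes give 26.21875, so that $n\operatorname{var}\hat\lambda$ is about 420 there.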
The asymptotic variance of the efficient estimate A* obtained by minimizing S is given by 
p(n var A*) ~ A4(1 + apy) {(3a — 40°) nu} + wt + Zap} + (2 —4a*) pi —ap, +1} (17) 
(this formula is derived in the Appendix). 
The asymptotic efficiency of A is therefore 
— atAK{ (3a — 4a) yh + i+ aye} + (2 — 40%) pj —ap, + 1} (1 +a) (18) 
vA wA{h + 4A(1 + a?) +A2(6 + 14a? + a4) + 2A3(2 + 8a? + a4 —a®) +A4(1+6a?+a4)}°  * 


No way of simplifying this complicated expression could be discovered. 
Values of $E_\lambda$ are given in Table 2. The efficiency of $\hat\lambda$ behaves in the same way as that of $\hat a$, being near unity when $\lambda$ or $a$ is small, and having unity as a limit when $\lambda \to 0$ with $a$ fixed or when $a \to 0$ with $\lambda$ fixed. For large $\lambda$, $E_\lambda$ is approximately

$$\frac{(1 - a^2)^3(1 + 5a^2 + 4a^4)}{1 + 6a^2 + a^4}.$$

From Tables 1 and 2 we see that for a wide range of values of $a$ and $\lambda$, $\hat a$ and $\hat\lambda$ are of reasonably high efficiency. The ease with which they may be computed is a considerable practical advantage and if the efficiencies corresponding to $a = \hat a$, $\lambda = \hat\lambda$ are both not much less than unity, it would usually not be worth while obtaining estimates by some more complicated method which makes use of the information contained in the sample serial correlations for lags greater than 2. Computation of the efficient estimates $\lambda^*$ and $a^*$ seems to be more laborious than one would expect. The formulae for partial derivatives of $\alpha_s$ with respect to $a$ and $\lambda$ are very complicated and the obvious procedure of obtaining successive approximations by Newton’s method, taking for example $(\hat a, \hat\lambda)$ as a first approximation, is therefore somewhat cumbersome; in practice it would probably be a good deal quicker to proceed by computing $S$ for a number of values $(a, \lambda)$ and determining its minimum by some method which does not require any knowledge of its functional form.


Table 2. Values of $E_\lambda(a, \lambda)$

                    |a|
   λ      0.1    0.3    0.5    0.7    0.9
  0.5    1.00   0.96   0.91   0.84   0.77
  1.0    0.99   0.92   0.81   0.69   0.56
  2.0    0.98   0.86   0.69   0.51   0.35
 10.0    0.97   0.76   0.49   0.24   0.08


It may be noted that the conclusions regarding the estimation of $a$ are applicable also to certain series, not subject to superimposed error, which are generated by a first-order linear autoregressive process with residuals forming a 1-dependent stationary process and therefore representable by a two-term moving average of a series of mutually uncorrelated random variables with a common variance. For then

$$x_t + ax_{t-1} = y_t + by_{t-1}$$

with $\operatorname{var} y_t = \sigma^2$, $\operatorname{cov}(y_t, y_s) = 0$ $(s \ne t)$, giving

$$2\pi f_x(\omega) = \frac{\sigma^2(1 + be^{i\omega})(1 + be^{-i\omega})}{(1 + ae^{i\omega})(1 + ae^{-i\omega})}. \tag{19}$$

Thus the spectral density (which determines the correlation structure) is of the same general form as (4) [§2] with $p = 1$. However, taking $|b| < 1$ (which gives the ‘regular’ moving average representation), we see that for (19) to reduce to (4), $-b = \mu_1$, the root of $\lambda a(z^2 + 1) + z\{1 + \lambda(1 + a^2)\} = 0$ with modulus less than unity. Now $\mu_1$ clearly lies between 0 and $-a$ (the value corresponding to $\lambda = \infty$), so that for the reduction to be possible, $0 < b < a$ $(a > 0)$ or $0 > b > a$ $(a < 0)$. The efficiency of estimation of both $a$ and $b$ for all normal series with spectral densities of the form (19) $(|b| < 1, |a| < 1)$, using the above method could of course be investigated in exactly the same way; it seems likely that the calculations would be equally tedious.
The general estimation problem for processes with spectral density given by the analogue of (19) for $p > 1$, viz.

$$2\pi f_x(\omega) = \frac{\sigma^2(1 + b_1 e^{i\omega} + \dots + b_p e^{ip\omega})(1 + b_1 e^{-i\omega} + \dots + b_p e^{-ip\omega})}{(1 + a_1 e^{i\omega} + \dots + a_p e^{ip\omega})(1 + a_1 e^{-i\omega} + \dots + a_p e^{-ip\omega})}$$

is incidentally quite different from that for the process with spectral density given by (4) because of the greater number of parameters involved ($2p + 1$ compared with $p + 2$).

A similar examination of the case $p = 2$ would at any rate be practicable, although the work involved would be very great and tables of values of efficiencies would be of triple entry. Here, moreover, Method A is fairly straightforward since it gives

$$C_3 + \hat a_1 C_2 + \hat a_2 C_1 = 0,$$
$$C_2 + \hat a_1 C_1 + (\hat a_2 C_1/\hat\rho_u(1)) = 0,$$

with $\hat\rho_u(1) = -\hat a_1/(1 + \hat a_2)$, so that it requires the solution only of a quadratic equation, and this may be written as an equation for $\hat a_1$, namely

$$\hat a_1^2(C_1^2 - C_2^2) + 2\hat a_1 C_2(C_1 - C_3) + C_3(C_1 - C_3) = 0,$$

(see Quenouille, 1947, p. 126, equation (7)).
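
A small sketch of this calculation follows. The quadratic has two roots, and the paper does not say how the appropriate one is to be chosen; the rule used here, which keeps a root only if the implied variances satisfy $\hat\sigma_u^2 > 0$ and $\hat\sigma'^2 \ge 0$, is an assumption made for the example. The input values are the theoretical autocovariances of a hypothetical series with $a_1 = -1$, $a_2 = 0.5$, $\sigma^2 = 1$ and $\sigma'^2 = 1$, not sample values.

```python
import math

def method_a_p2(C0, C1, C2, C3):
    """Solve the Method A equations for p = 2; return admissible solutions
    as tuples (a1, a2, var_u, var_err)."""
    qa = C1 * C1 - C2 * C2
    qb = 2.0 * C2 * (C1 - C3)
    qc = C3 * (C1 - C3)
    disc = math.sqrt(qb * qb - 4.0 * qa * qc)
    solutions = []
    for a1 in ((-qb + disc) / (2.0 * qa), (-qb - disc) / (2.0 * qa)):
        a2 = -(C3 + a1 * C2) / C1     # from C3 + a1*C2 + a2*C1 = 0
        rho1 = -a1 / (1.0 + a2)       # implied first autocorrelation of (u_t)
        var_u = C1 / rho1             # from C1 = var_u * rho_u(1)
        var_err = C0 - var_u          # from C0 = var_u + var_err
        if var_u > 0.0 and var_err >= 0.0:   # assumed admissibility rule
            solutions.append((a1, a2, var_u, var_err))
    return solutions

# Theoretical autocovariances for the hypothetical series described above.
solutions = method_a_p2(3.4, 1.6, 0.4, -0.4)
print(solutions)
```

For these inputs one root of the quadratic implies a negative $\hat\sigma_u^2$ and is discarded; the remaining root returns the constants used to construct the example.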

For $p > 2$, however, a general investigation would be a most formidable undertaking, and probably quite impracticable. Nevertheless, one might hope that then some useful information on efficiency would be obtained by considering particular estimates for various values of $a_1, a_2, \dots, \lambda$.





4. EFFECT OF NON-NORMALITY 


If superimposed error were not present, the effect of non-normality would not be at all serious. For $(x_t)$ would be a linear process and so the asymptotic distributional properties of the sample serial correlations $r_i$ would not be affected, the joint distribution of $n^{1/2}(r_i - \rho_i)$ $(i = 1, 2, 3, \dots)$ remaining asymptotically normal with zero means and limiting covariance matrix with elements given by Bartlett’s formulae [see, for example, Bartlett (1955), p. 256]. Thus the asymptotic behaviour of estimates $\hat a_1, \hat a_2, \dots, \hat a_p$ obtained from equations of the form (7) with $r = 1, 2, \dots, p$ (which are asymptotically efficient under the normality assumption) would remain the same [as can in fact be shown directly, see Bartlett (1955), pp. 258-9]. The asymptotic variance of $C_0 = \hat\sigma_x^2$ would of course contain an additional term (a multiple of the fourth cumulant of $x$) but very often we would not want to estimate this. But with superimposed error $(x_t)$ becomes the sum of two independent linear processes which in general is certainly not a linear process. Now if $(x_t)$ is of this form, so that we can write

$$x_t = \sum_r (g_r y_{t-r} + h_r z_{t-r}),$$

where $(y_t)$ and $(z_t)$ are independent completely random series, the contribution of the fourth cumulant terms to $\lim_{n\to\infty} n\operatorname{cov}(C_i, C_j)$ is

$$\sum_{v=-\infty}^{\infty} \kappa(x_t, x_{t+i}, x_{t+v}, x_{t+v+j}) = \kappa_4(y)G_iG_j + \kappa_4(z)H_iH_j, \tag{20}$$

where $\kappa_4(y)$, $\kappa_4(z)$ are the fourth cumulants of $(y_t)$, $(z_t)$, $G_i = \sum_r g_r g_{r+i}$ and $H_i = \sum_r h_r h_{r+i}$.


From (20), it is easily shown that the contribution to $\lim_{n\to\infty} n\operatorname{cov}(r_i, r_j)$ is

$$\{\kappa_2(y)G_0 + \kappa_2(z)H_0\}^{-4}\{\kappa_4(z)\kappa_2^2(y) + \kappa_4(y)\kappa_2^2(z)\}\{G_iH_0 - H_iG_0\}\{G_jH_0 - H_jG_0\} \tag{21}$$

($\kappa_2(y) = \operatorname{var} y$, $\kappa_2(z) = \operatorname{var} z$). This will not be zero for all $i$, $j$ unless $G_i = cH_i$ $(i = 0, 1, 2, \dots)$ (which in general will not hold except when $g_r \propto h_r$ so that the process reduces to a single linear one) or

$$\gamma_2(y) + \gamma_2(z) = 0, \tag{22}$$

where $\gamma_2$ denotes the standardized measure of flatness $\kappa_4/\kappa_2^2$. It follows that for the superimposed error model the formulae for asymptotic variances and covariances of estimates of $a_1, a_2, \dots, a_p, \lambda$ obtained by any of the methods that have been discussed (Method A, Method B, or Whittle’s asymptotically efficient method) will remain valid in the non-normal case only when (22) holds (there is no difficulty about their asymptotic normality as the asymptotic normality of the sample serial covariances is easily established by the method used in Walker (1954)). However, it seems unlikely that (22) would be satisfied approximately in any practical situation unless both the original series and the superimposed error are nearly normal. Hence if an appreciable degree of non-normality is suspected one would always want to take account of the fourth cumulant terms in the asymptotic variance formulae. In principle there is no difficulty in doing this, but it makes the algebra still more troublesome, and the details have not been worked out even for the case $p = 1$. To obtain consistent estimates of these terms one will of course require consistent estimates of $\kappa_4(v)$ and $\kappa_4(\eta)$, but it is fairly easy to see that these can be found in terms of

$$\sum_{t=1}^{n} x_t^4/n, \qquad \sum_{t=1}^{n-s} x_t^2 x_{t+s}^2/(n - s) \text{ for some } s > 0,$$

and the serial covariances $C_i$.
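
The size of the fourth-cumulant contribution (21), and its disappearance under condition (22), can be illustrated for a first-order autoregressive series with superimposed error, for which $g_r = (-a)^r$, $h_0 = 1$ and $h_r = 0$ $(r > 0)$. The function name and all numerical values in this sketch are assumptions chosen for illustration.

```python
def cumulant_contribution(i, j, G, H, k2y, k2z, k4y, k4z):
    """Fourth-cumulant contribution to lim n cov(r_i, r_j), expression (21)."""
    scale = (k2y * G[0] + k2z * H[0]) ** -4
    mixed = k4z * k2y ** 2 + k4y * k2z ** 2
    return (scale * mixed
            * (G[i] * H[0] - H[i] * G[0])
            * (G[j] * H[0] - H[j] * G[0]))

a = -0.6                                               # hypothetical AR(1) constant
G = [(-a) ** i / (1.0 - a * a) for i in range(3)]      # G_i = sum_r g_r g_{r+i}
H = [1.0, 0.0, 0.0]                                    # completely random error

# Flatness gamma_2 = kappa_4 / kappa_2^2 is +1 for y and -1 for z: (22) holds.
balanced = cumulant_contribution(1, 1, G, H, k2y=1.0, k2z=2.0, k4y=1.0, k4z=-4.0)
# Normal y (kappa_4 = 0) with the same non-normal z: (22) fails.
unbalanced = cumulant_contribution(1, 1, G, H, k2y=1.0, k2z=2.0, k4y=0.0, k4z=-4.0)

print(balanced, unbalanced)
```

The first value is zero because the two flatness measures cancel; the second is not, so the normal-theory variance of $r_1$ would need correcting in that case.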

It is perhaps worth mentioning that the question of the effect of non-normality in the 
presence of superimposed error was suggested to the author by a definite practical problem, 
the analysis of a series of observations of the variation of latitude, which is described by 
Walker & Young (1955). Here one had essentially a bivariate autoregressive model with 
superimposed error, {x,, y,} being the stationary solution of 


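$$x_t - \alpha x_{t-1} + \beta y_{t-1} = \eta_{1,t} + v_{1,t} - \alpha\eta_{1,t-1} + \beta\eta_{2,t-1},$$
$$y_t - \beta x_{t-1} - \alpha y_{t-1} = \eta_{2,t} + v_{2,t} - \beta\eta_{1,t-1} - \alpha\eta_{2,t-1},$$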


where $(v_{1,t})$, $(v_{2,t})$ are independent completely random series representing the basic disturbances, and $(\eta_{1,t})$, $(\eta_{2,t})$ independent completely random series representing the superimposed observational error. The main problem was the estimation of parameters $\kappa$, $\gamma$ such that $\alpha = e^{-\kappa}\cos\gamma$, $\beta = e^{-\kappa}\sin\gamma$. Estimates $\hat\kappa$ and $\hat\gamma$ were obtained using the sample
serial autocovariances and cross-covariances for lags 1 and 2 and asymptotic variance
formulae derived on the assumption that all the above completely random series were
normal. There were reasonable physical grounds for supposing the $v$ series to be approximately normal but there was some doubt about the normality of the $\eta$ series, although
this was not sufficient to make it seem worth while undertaking the laborious task of 
modifying the variance formulae. 


I wish to thank Mr D. A. East for the computation of Tables 1 and 2. 
APPENDIX


The formulae for the asymptotic variances of the efficient estimates a* and A* may be obtained as 
follows. 


We have b3(1 + ae) (1+ae-) Oo 








\-1 = bhi eat) a 23 
Mo = TX +00) (1400) (28) 
Now the denominator of (23) is equal to
$$\lambda a z^{-1}(z-\mu_1)(z-\mu_2), \quad \text{where } z = e^{i\omega},$$
$\mu_1$ and $\mu_2 = \mu_1^{-1}$ being the roots of the equation
$$\lambda a(z^2+1) + z\{1+\lambda(1+a^2)\} = 0.$$
This is also equal to (b)+ 6,2) (by +6,z-1) by definition, so that bj = —Aap,. Hence putting (23) into ] 
partial fractions gives 
f 1 Mi Be 
(g(w)}-* = — apy) — ( - 
. " Aa(Uy— fe) \2—fy 2-H 
= = Opty + 8 fy 2-41 = yt) + (1,2). 


A(t, — Ha) 
From this, incidentally, the coefficients of the sample serial covariances in S can easily be seen to be 


a, = (1+a*+ 2ay,)/(1 — fi), 


and Oy =H, = pi-Na(1+p})+,(1+4%)}/(1—pi) (8 > 0). 
5 2 all 
Now en og 
aa aA bo 


1 
1 diene laies =? -1)\-1 _ z)-1 24 
Na(, — Ma) {fy 2-1 — fy 2-1) + py a1 — py 2)*} (24) 
since the constant term in its expansion in positive and negative powers of z must vanish. 
Similarly 











ee 2 A+ Sent 1) 
dais aa_— (“Ss eaz 1 +az Aa(z?+1)+2{14+A(1+a%)} 
_ logo, 2 et 1 1+A(1=a") (| {1} 
“ae l+az 1l+az-! a aA(fy—fe) \2-fy Za 
—at 
= 2(1+az)-2+2-"(1 +az-1)-14 pi a py 2-7" 1 — 271) -1 + wy 2(1 — 2, 2)-4}, (25) 


a®A( (1 — fy) 
since the constant term in its expansion also vanishes. 
The constant terms in the expansions of 


(M2741 — ay 2-4)3 + ye — yz)", {21 +az)-* +2741 +z) “12 
and {fy 2-7*(L — yz)! + fy 21 — fo, 2)“4} (21 + az) 42-1 +273)“ 


are easily seen to be respectively 


2mi(l—mi), 2/(1—a*) and 2,/(1+ay). | 







Hence from (24) and (25), if we write 
2Vin=U = a vas) 


Ujy2 Une 
where, for example, 








a1 2 
Uy, = constant term in the expansion of (° * “) ; 
2 2 
Ma 2M ‘ 
then Uy, = er (; er = 2ut/A4a2(1 — 43), 


“ew » [Pi oa ou? ) 
= al Ly — [t2) = 


Uj 


+ an 
L+ap, aA(M,— ft) 
_ a a. ) fy(1 + 2ap, + 44) 
© Ata(wi—1) 1+ apy a(1—p2)? 


on eliminating A from the second term in the bracket by means of the relation 








Afap? + (1+a?) a, +a} = —-y,, 
11 _ 72 —a2)\2 2y?2 
—_ —_ hg ee = ( ‘f) (oe - ) 5 
l—a®  a®A(py— fe) \L + apy, @WA(My— fle) ] 1— py 
a 21 ee ee * py(1 + 2apy + p17) 
~ l-a' a(1—p?) l+ap, a(l—y?)? f° 
Now if P=(l+ap,)-, Q = (1+ 2am, +43)/a(1 — 3), 
Mt Mi 
= det U = 4{— a, (2PQ 2_(P = 
ane . Na®(u2—1)8(1 — a2) * Ata — 3)? a i lic | 
rs th 
~ Afa2(1 — a?) (1 + ayy)? (1 — 3)" 
j .. 2 l 2 
Hence nvara* ~ 2u,,/A = ees em 
(a+ #,)° 
Ata*(1 — pi)? (1—a?) (1+ay,)?{ 1 ‘ 
and nvarA* ~ 2ug/A = - - ula+p,)" lca +(1—y7)Q(2P+Q)}, 


which, on simplifying the expression in brackets, reduces to 


A4(1 + apy) {(38a— 4a’) pn} +My + 2am? + (2 — 4a?) pi—apy t+ 03 he 


Deterministic customer impatience in the queueing system GI/M/1


By P. D. FINCH 


Research Techniques Division, London School of Economics 


1. INTRODUCTION 
In this paper the following single server queueing system is considered. Customers arrive at the instants $t_1, t_2, \ldots, t_n, \ldots$ such that the $\tau_n = t_n - t_{n-1}$, $n = 1, 2, \ldots$, $t_0 = 0$, are independently and identically distributed non-negative random variables with common distribution function $A(x)$ and finite expectation $a = \int_0^\infty x\,dA(x)$. Customers are served in the order of their arrival, and if a customer arrives to find the server idle then he commences service immediately. A customer will wait for service in the queue only for a time not exceeding a fixed time $W$. If a customer waits as long as $W$ then he departs, never to return. It is supposed that the service times of customers who receive service are independently and identically distributed random variables, independent of the input process $\{t_n\}$ and with common distribution function $B(x) = 1 - e^{-\mu x}$ for $x \ge 0$. The case when the input process is a Poisson process, that is, $A(x) = 1 - e^{-\lambda x}$ for $x \ge 0$, has been studied by Barrer (1957), who obtained the limiting distribution of queue size at an arbitrary instant of time. Let $w_n$, $n = 1, 2, \ldots$, denote the time the $n$th customer waits in the queue; if he departs impatiently then $w_n = W$, if he does not then $w_n < W$. Let $u_n = s_n - \tau_{n+1}$, where $s_n$ is independent of $\tau_{n+1}$ and has the exponential distribution of service time, and $\tau_n$, defined above, has the distribution $A(x)$ of inter-arrival time. Let $U(x)$ be the common distribution function of the identically distributed random variables $u_n$. Then


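$$U(x) = \begin{cases} 1 - \mu e^{-\mu x}\displaystyle\int_0^\infty A(y)\,e^{-\mu y}\,dy & \text{if } x \ge 0,\\[2mm] 1 - \mu e^{-\mu x}\displaystyle\int_{-x}^\infty A(y)\,e^{-\mu y}\,dy & \text{if } x < 0. \end{cases} \qquad (1)$$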


By relating $w_{n+1}$ to $w_n$ in a manner analogous to that of Lindley (1952) and taking into account the characteristic property of the exponential distribution we obtain

$$w_{n+1} = \begin{cases} 0 & \text{if } w_n + u_n \le 0,\\ w_n + u_n & \text{if } 0 < w_n + u_n < W,\\ W & \text{if } W \le w_n + u_n. \end{cases} \qquad (2)$$
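The recursion (2) can be iterated directly. The following is a minimal simulation sketch, not part of the paper: the function name `simulate_waits`, the seed, and the parameter values are illustrative assumptions, and the inter-arrival sampler is supplied by the caller so that any input distribution $A(x)$ may be tried.

```python
import random

def simulate_waits(n_customers, W, mu, draw_interarrival, seed=0):
    """Iterate w_{n+1} = min(W, max(0, w_n + s_n - tau_{n+1})), starting from w_1 = 0.

    s_n is exponential with rate mu, as required by equation (2);
    draw_interarrival(rng) returns one inter-arrival time tau (caller's choice).
    """
    rng = random.Random(seed)
    w = 0.0
    waits = []
    for _ in range(n_customers):
        waits.append(w)
        s = rng.expovariate(mu)        # exponential variable s_n
        tau = draw_interarrival(rng)   # inter-arrival time tau_{n+1}
        w = min(W, max(0.0, w + s - tau))
    return waits

# Hypothetical example: Poisson input with rate 0.8, service rate 1, W = 2.
waits = simulate_waits(20000, W=2.0, mu=1.0,
                       draw_interarrival=lambda rng: rng.expovariate(0.8))
# Fraction of customers whose wait reached W, i.e. who departed impatiently.
frac_lost = sum(1 for w in waits if w >= 2.0) / len(waits)
```

The empirical distribution of `waits` gives a rough estimate of the limiting distribution of waiting time, and `frac_lost` a rough estimate of the proportion of impatient departures.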


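Write $F_n(x) = P(w_n \le x)$; then it follows from (2) that

$$F_{n+1}(x) = \begin{cases} 0 & \text{if } x < 0,\\ \displaystyle\int_{-\infty}^{x} F_n(x-y)\,dU(y) & \text{if } 0 \le x < W,\\ 1 & \text{if } W \le x. \end{cases} \qquad (3)$$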
The sequence of random variables $\{w_n\}$ forms a Markov chain with state space the finite interval $(0, W)$. It follows from the general theory of Markov chains that a limiting distribution $F(x) = \lim_{n\to\infty} F_n(x)$ exists and is independent of the initial distribution $F_1(x)$. It follows from (3) by Lebesgue's convergence theorem that $F(x)$ is the solution to the following integral equation

$$F(x) = \begin{cases} 0 & \text{if } x < 0,\\ \displaystyle\int_{-\infty}^{x} F(x-y)\,dU(y) & \text{if } 0 \le x < W,\\ 1 & \text{if } W \le x. \end{cases} \qquad (4)$$

The limiting distribution $F(x)$ is the unique solution to (4) because if there were a second solution $F^*(x)$ we could take it for the initial distribution $F_1(x)$. From (3) we would have $F_n(x) = F^*(x)$ for each $n = 1, 2, \ldots$ and hence $F^*(x) = F(x)$.


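Introduce random variables $c_n = W - w_n$ and $v_n = -u_n$; then we have

$$c_{n+1} = \begin{cases} 0 & \text{if } c_n + v_n \le 0,\\ c_n + v_n & \text{if } 0 < c_n + v_n < W,\\ W & \text{if } W \le c_n + v_n. \end{cases} \qquad (5)$$

Write $G_n(x) = P(c_n \le x) = 1 - F_n(W - x)$. Then $G(x) = \lim_{n\to\infty} G_n(x)$ exists and

$$G(x) = 1 - F(W - x).$$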


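Further, $G(x)$ is the unique solution to the following integral equation

$$G(x) = \begin{cases} 0 & \text{if } x < 0,\\ \displaystyle\int_{-\infty}^{x} G(x-y)\,dV(y) & \text{if } 0 \le x < W,\\ 1 & \text{if } W \le x, \end{cases} \qquad (6)$$

where $V(x)$ is the common distribution function of the identically distributed random variables $v_n$, and is given by

$$V(x) = \begin{cases} \mu e^{\mu x}\displaystyle\int_x^\infty A(y)\,e^{-\mu y}\,dy & \text{if } x \ge 0,\\[2mm] \mu e^{\mu x}\displaystyle\int_0^\infty A(y)\,e^{-\mu y}\,dy & \text{if } x < 0. \end{cases} \qquad (7)$$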
The main result of this paper is the solution of the integral equation (6) when V(x) is given by (7). 


2. SOLUTION OF THE INTEGRAL EQUATION 


In this section we prove the following theorem. 


THEOREM 1. If $G(W, x)$ is the solution to the integral equation (6) when $V(x)$ is given by (7), then

$$G(W, x) = \{\mu M(x) - M'(x)\}\,\{\mu M(W)\}^{-1} \quad \text{for } 0 \le x < W, \qquad (8)$$

where $M(x)$ is a function whose Laplace transform $m(p) = \int_0^\infty e^{-px} M(x)\,dx$ is given by

$$m(p) = \frac{p - \mu + \mu\{a(p) - a(\mu)\}}{(p - \mu)\{\mu a(p) + p - \mu\}} \qquad (9)$$

and $a(p) = \int_0^\infty e^{-px}\,dA(x)$.


Proof. The solution $G(W, x)$ to the integral equation (6) is a function of two variables $W$ and $x$. It is reasonable therefore to try to find a solution of the form $G(W, x) = G_1(W)G_2(x)$. If a $G(W, x)$ of this form is substituted into equation (6) and use made of (7) it is not difficult, though rather lengthy, to determine the functions $G_1(W)$ and $G_2(x)$. It will be found that the solution is of the form (8). In order to prove the theorem here we note that since the solution to (6) exists and is unique it is sufficient to substitute expression (8) into the equation and determine the unknown function $M(x)$.

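Writing $N(x) = \mu M(x) - M'(x)$ and performing this substitution we obtain

$$N(x) - \int_0^x N(x-y)\,dV(y) + \mu a(\mu)\int_0^x N(x-y)\,e^{\mu y}\,dy = \mu a(\mu)\,e^{\mu x}\left\{M(W)\,e^{-\mu W} + \int_0^W N(u)\,e^{-\mu u}\,du\right\} = \mu a(\mu)\,M(0)\,e^{\mu x}. \qquad (10)$$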


This equation has been derived under the assumption that 0 < x < W, but W is arbitrary 
and the equation is independent of W; thus equation (10) is true for all non-negative values 
of x. 

Introduce the Laplace transforms

$$n(p) = \int_0^\infty e^{-px} N(x)\,dx, \qquad V^+(p) = \int_0^\infty e^{-px}\,dV(x);$$

then from (10) we obtain

$$n(p)\left\{1 - V^+(p) + \frac{\mu a(\mu)}{p - \mu}\right\} = \frac{\mu a(\mu)}{p - \mu}\,M(0). \qquad (11)$$

From (7) we have $V^+(p) = \mu\{a(\mu) - a(p)\}/(p - \mu)$, and hence from (11) we obtain

$$n(p) = \mu a(\mu)\,M(0)/\{p - \mu + \mu a(p)\}. \qquad (12)$$

Since $N(x) = \mu M(x) - M'(x)$ we have

$$m(p) = \frac{M(0)}{p - \mu} - \frac{n(p)}{p - \mu}. \qquad (13)$$

It is clear from (8) that we need only determine $M(x)$ up to a multiplicative constant; thus we can take $M(0) = 1$. Equation (9) then follows from (12) and (13).


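Example. Suppose that the input process $\{t_n\}$ is an Erlang $E_k$ process with mean $k/\lambda$, that is, $dA(x) = \lambda^k x^{k-1} e^{-\lambda x}\,dx/(k-1)!$ for $x \ge 0$. Then $a(p) = \lambda^k/(\lambda + p)^k$ and from (12)

$$n(p) = \frac{\mu a(\mu)(\lambda + p)^k}{(\lambda + p)^k (p - \mu) + \mu\lambda^k}. \qquad (14)$$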
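Let $\beta_0, \beta_1, \ldots, \beta_k$ ($\beta_0 = 0$) be the $(k+1)$ roots of $(\lambda + p)^k (p - \mu) + \mu\lambda^k = 0$ (it is easy to see that these roots are in general distinct). Expanding (14) in partial fractions we have

$$n(p) = \sum_{r=0}^{k} \frac{K_r}{p - \beta_r}, \qquad (15)$$

where

$$K_r = \lim_{p\to\beta_r}\frac{(p - \beta_r)\,\mu a(\mu)(\lambda + p)^k}{(\lambda + p)^k (p - \mu) + \mu\lambda^k} = \frac{\mu a(\mu)(\lambda + \beta_r)}{\lambda + \beta_r + k(\beta_r - \mu)}.$$

From (14) and (15) we obtain

$$\sum_{r=0}^{k} \frac{K_r}{\mu - \beta_r} = 1. \qquad (16)$$

Taking inverse transforms of (15) we have

$$N(x) = K_0 + \sum_{r=1}^{k} K_r\,e^{\beta_r x}. \qquad (17)$$

But $M(x) = e^{\mu x} - e^{\mu x}\int_0^x e^{-\mu y} N(y)\,dy$ and so taking (16) into account we obtain

$$M(x) = \frac{K_0}{\mu} + \sum_{r=1}^{k} \frac{K_r}{\mu - \beta_r}\,e^{\beta_r x}. \qquad (18)$$

Thus from (8) the solution to (6) with $V(x)$ given by (7) is

$$G(W, x) = \left[K_0 + \sum_{r=1}^{k} K_r\,e^{\beta_r x}\right]\left[K_0 + \sum_{s=1}^{k} \frac{\mu K_s}{\mu - \beta_s}\,e^{\beta_s W}\right]^{-1} \quad (0 \le x < W) \qquad (19)$$

and the limiting distribution of waiting time $F(W, x) = 1 - G(W, W - x)$ is given by

$$F(W, x) = \left[\sum_{r=1}^{k} K_r\left(\frac{\mu}{\mu - \beta_r} - e^{-\beta_r x}\right) e^{\beta_r W}\right]\left[K_0 + \sum_{s=1}^{k} \frac{\mu K_s}{\mu - \beta_s}\,e^{\beta_s W}\right]^{-1}. \qquad (20)$$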
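As an illustration of the Erlang example, the sketch below evaluates the solution numerically for $k = 2$. It is a sketch under stated assumptions rather than the author's computation: it takes $a(p) = \{\lambda/(\lambda+p)\}^2$, uses the fact that the cubic $(\lambda+p)^2(p-\mu)+\mu\lambda^2$ factorises as $p\{p^2 + (2\lambda-\mu)p + \lambda^2 - 2\lambda\mu\}$, takes $K_r = \mu a(\mu)(\lambda+\beta_r)/\{\lambda+\beta_r+k(\beta_r-\mu)\}$, and excludes the critical case $\lambda = 2\mu$ where 0 becomes a double root. The function name and parameter values are hypothetical.

```python
import math

def erlang2_wait_cdf(lam, mu, W):
    """Return (F, K, betas) for Erlang-2 input, with F(x) = 1 - G(W, W - x) on [0, W)."""
    k = 2
    if math.isclose(lam, k * mu):
        raise ValueError("lam = 2*mu gives a double root at 0; not handled in this sketch")
    a_mu = (lam / (lam + mu)) ** k                    # a(mu) for Erlang-2 input
    disc = math.sqrt(mu * mu + 4.0 * lam * mu)        # discriminant of the quadratic factor
    betas = [0.0,
             0.5 * (mu - 2.0 * lam + disc),
             0.5 * (mu - 2.0 * lam - disc)]
    # Partial-fraction coefficients of n(p), one per root.
    K = [mu * a_mu * (lam + b) / (lam + b + k * (b - mu)) for b in betas]

    def N(x):   # N(x) = sum_r K_r exp(beta_r x)
        return sum(Kr * math.exp(b * x) for Kr, b in zip(K, betas))

    def M(x):   # M(x) = sum_r K_r exp(beta_r x) / (mu - beta_r)
        return sum(Kr * math.exp(b * x) / (mu - b) for Kr, b in zip(K, betas))

    def F(x):   # F(W, x) = 1 - N(W - x) / (mu M(W))
        if not 0.0 <= x < W:
            raise ValueError("x must lie in [0, W)")
        return 1.0 - N(W - x) / (mu * M(W))

    return F, K, betas

# Hypothetical parameter values (traffic intensity lam / (2 mu) = 2/3).
F, K, betas = erlang2_wait_cdf(lam=2.0, mu=1.5, W=2.0)
# The identity sum_r K_r / (mu - beta_r) = 1 serves as a numerical check.
check_16 = sum(Kr / (1.5 - b) for Kr, b in zip(K, betas))
```

For larger $k$ the roots would have to be found numerically; the closed form is used here only because the $k = 2$ case reduces to a quadratic.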
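In the particular case $k = 1$, that is, the input process is a Poisson process, we obtain

$$F(W, x) = \begin{cases} \left[1 - \dfrac{\lambda}{\mu}\,e^{(\lambda-\mu)x}\right]\left[1 - \dfrac{\lambda^2}{\mu^2}\,e^{(\lambda-\mu)W}\right]^{-1} & (\lambda \ne \mu,\ 0 \le x < W),\\[3mm] \dfrac{1 + \mu x}{2 + \mu W} & (\lambda = \mu,\ 0 \le x < W). \end{cases} \qquad (21)$$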
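The Poisson case is simple enough to evaluate directly. The sketch below assumes the closed form $F(W,x) = \{1-\rho e^{(\lambda-\mu)x}\}/\{1-\rho^2 e^{(\lambda-\mu)W}\}$ with $\rho = \lambda/\mu$ for $\lambda \ne \mu$, and its limit $(1+\mu x)/(2+\mu W)$ for $\lambda = \mu$; the function name is an illustrative choice.

```python
import math

def poisson_wait_cdf(x, W, lam, mu):
    """F(W, x) for Poisson input (rate lam) and exponential service (rate mu), 0 <= x < W."""
    if not 0.0 <= x < W:
        raise ValueError("x must lie in [0, W)")
    if math.isclose(lam, mu):
        # Limiting form as lam -> mu.
        return (1.0 + mu * x) / (2.0 + mu * W)
    rho = lam / mu
    num = 1.0 - rho * math.exp((lam - mu) * x)
    den = 1.0 - rho * rho * math.exp((lam - mu) * W)
    return num / den
```

Two sanity checks are available: for $\rho < 1$ and large $W$ the value at $x = 0$ approaches $1 - \rho$, the familiar probability of not waiting in the queue without impatience, and values for $\lambda$ close to $\mu$ approach the $\lambda = \mu$ branch.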


3. RELATION TO THE CASE W = ∞


If the traffic intensity $\rho = (\mu a)^{-1} < 1$ and $W = \infty$, then it follows from a theorem of Lindley (Lindley, 1952) that a limiting distribution of waiting time, $F(x)$, exists and is the unique solution to (4) with $W = \infty$; in this case the solution to (6) with $W = \infty$ does not exist. When $\rho < 1$, $W = \infty$ and $V(x)$ is given by (7), that is, service time has an exponential distribution with parameter $\mu$, it is known (Kendall, 1953) that this limiting distribution $F(x)$ is of the following form

$$F(x) = 1 - \eta\,e^{-\mu(1-\eta)x}, \qquad (22)$$

where $\eta$ is the unique solution in $(0, 1)$ of the equation

$$\eta = \int_0^\infty e^{-\mu(1-\eta)x}\,dA(x).$$
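The constant in (22) is defined only implicitly, as the root in $(0,1)$ of an equation of the form $\eta = a\{\mu(1-\eta)\}$, where $a(p)$ is the Laplace-Stieltjes transform of $A(x)$. A minimal sketch of one way to compute it is given below; successive substitution started from 0 is an assumption of this sketch (it is a standard device when $\rho < 1$), and the function name and tolerances are illustrative.

```python
def solve_eta(a_transform, mu, tol=1e-12, max_iter=100000):
    """Solve eta = a(mu * (1 - eta)) by successive substitution, starting from eta = 0.

    a_transform(p) must return the Laplace-Stieltjes transform of the
    inter-arrival distribution at p >= 0.
    """
    eta = 0.0
    for _ in range(max_iter):
        new_eta = a_transform(mu * (1.0 - eta))
        if abs(new_eta - eta) < tol:
            return new_eta
        eta = new_eta
    raise RuntimeError("iteration did not converge; is the traffic intensity below 1?")

# Poisson input with rate 1 and service rate 2: a(p) = 1 / (1 + p).
eta_poisson = solve_eta(lambda p: 1.0 / (1.0 + p), mu=2.0)

# Erlang-2 input with rate parameter 2 and service rate 1.5: a(p) = (2 / (2 + p))**2.
eta_erlang = solve_eta(lambda p: (2.0 / (2.0 + p)) ** 2, mu=1.5)
```

For Poisson input the root coincides with the traffic intensity $\lambda/\mu$, which gives a convenient check on the iteration.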


We shall show in a heuristic way that when $\rho < 1$, Theorem 1 gives a solution $F(x)$ of the form (22) when $W \to \infty$.

If $F(W, x) = 1 - G(W, W - x)$ is the waiting time distribution when $\rho < 1$ then from (8)

$$\mu M(W)\,F(W, x) = \mu M(W) - \mu M(W - x) + M'(W - x) \quad (0 \le x < W). \qquad (23)$$

In particular, assuming that $M(x)$, $M'(x)$ are continuous and letting $x \to 0+$ in (23) we obtain

$$\mu M(W)\,F(W, 0) = M'(W). \qquad (24)$$

But $W$ is arbitrary and hence equation (24) is true for all $W > 0$. Replacing $W$ by $(W - x)$ in (24) and substituting for $M'(W - x)$ in (23) gives

$$F(W, x) = 1 - \{1 - F(W - x, 0)\}\,\frac{M(W - x)}{M(W)}. \qquad (25)$$

Let us assume that $\lim_{W\to\infty} F(W, x)$ exists and write $\lim_{W\to\infty} F(W, 0) = 1 - \theta$, where we assume that $0 < \theta < 1$. (This is not the case if $\rho \ge 1$.) From (24) we obtain

$$\lim_{W\to\infty} \frac{M'(W)}{M(W)} = \mu(1 - \theta). \qquad (26)$$

From (25) and the existence of $\lim_{W\to\infty} F(W, x)$ we deduce that

$$\lim_{W\to\infty} \frac{M(W - x)}{M(W)}$$

exists and that

$$\lim_{W\to\infty} F(W, x) = 1 - \theta \lim_{W\to\infty} \frac{M(W - x)}{M(W)}. \qquad (27)$$

Write $R(x) = \lim_{W\to\infty} M(W - x)/M(W)$ and assume that it is possible to interchange the limiting operation and differentiation with respect to $x$. We obtain

$$R'(x) = -\lim_{W\to\infty} \frac{M'(W - x)}{M(W)} = -\left[\lim_{W\to\infty} \frac{M'(W - x)}{M(W - x)}\right]\left[\lim_{W\to\infty} \frac{M(W - x)}{M(W)}\right] = -\mu(1 - \theta)\,R(x)$$

using equation (26). But $R(0) = 1$ and hence $R(x) = e^{-\mu(1-\theta)x}$. Substitution into (27) gives

$$\lim_{W\to\infty} F(W, x) = 1 - \theta\,e^{-\mu(1-\theta)x}, \qquad (28)$$

which is of the form (22). In order to prove the equivalence of (22) and (28) it would be necessary to show that $\theta = \eta$. We shall not do so here.
If $\rho = 1$ then neither (4) nor (6) possesses a solution when $W = \infty$. If $\rho > 1$ it follows from Lindley's theorem that a solution to (6) with $W = \infty$ exists, but that the solution to (4) with $W = \infty$ does not exist. The case $\rho > 1$ is of interest because the solution to (6) with $V(x)$ given by (7) and $W = \infty$ is the waiting time distribution of a queue with Poisson input process, parameter $\mu$, and service time distribution $A(x)$.* For such a queue there exists the well-known Pollaczek-Khintchine formula for the Laplace transform of the waiting


* That is of the dual queueing process (cf. Foster, 1959). The relationship of the duality principle 
to the queueing process of this paper is not investigated here. 
time distribution (Lindley, 1952). We shall now show in a heuristic manner how the Pollaczek-Khintchine formula can be deduced from Theorem 1 when $\rho > 1$. Let us assume that $\lim_{W\to\infty} M(W)$ exists when $\rho > 1$. (This is not the case if $\rho \le 1$.) Then

$$\lim_{W\to\infty} M(W) = \lim_{p\to 0} p\,m(p). \qquad (29)$$

Equation (29) is valid if and only if $\int_0^t u\,dM(u) = o(t)$ as $t \to \infty$ (Widder, 1946, Chapter V, Theorem 3.6). This condition is implied by the existence of $\lim_{W\to\infty} M(W)$. Using equation (9) we obtain

$$\lim_{W\to\infty} M(W) = \frac{a(\mu)}{1 - \sigma},$$

where $\sigma = a\mu = \rho^{-1}$, and hence from (8)

$$G(x) = \lim_{W\to\infty} G(W, x) = \frac{1 - \sigma}{\mu a(\mu)}\,\{\mu M(x) - M'(x)\}. \qquad (30)$$

Write $g(p) = \int_0^\infty e^{-px} G(x)\,dx$. Since $\mu M(x) - M'(x) = N(x)$ with Laplace transform $n(p)$ given by (12), we have

$$g(p) = (1 - \sigma)/\{p - \mu + \mu a(p)\}.$$

Thus

$$\int_{0-}^\infty e^{-px}\,dG(x) = p\,g(p) = (1 - \sigma)\left[1 - \frac{\mu\{1 - a(p)\}}{p}\right]^{-1}, \qquad (31)$$

which is the Pollaczek-Khintchine formula.
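As a quick check of the transform $(1-\sigma)p/\{p-\mu+\mu a(p)\}$, consider the special case, assumed here purely for illustration, in which $A(x) = 1 - e^{-\lambda x}$, so that $a(p) = \lambda/(\lambda+p)$, $a = 1/\lambda$ and $\sigma = \mu/\lambda < 1$:

```latex
\begin{aligned}
\frac{(1-\sigma)\,p}{p-\mu+\mu\lambda/(\lambda+p)}
  &= \frac{(1-\sigma)\,p\,(\lambda+p)}{(p-\mu)(\lambda+p)+\mu\lambda}
   = \frac{(1-\sigma)\,p\,(\lambda+p)}{p^{2}+(\lambda-\mu)p} \\
  &= \frac{(1-\sigma)(\lambda+p)}{p+\lambda-\mu}
   = (1-\sigma) + \sigma\,\frac{\lambda-\mu}{p+\lambda-\mu},
\end{aligned}
```

using $(1-\sigma)\mu = \sigma(\lambda-\mu)$ in the last step. This is the transform of a distribution with an atom $1-\sigma$ at the origin and an exponential tail of rate $\lambda-\mu$, which is the familiar waiting-time law of the queue with Poisson input of rate $\mu$ and exponential service of rate $\lambda$, as the duality described in the text would lead one to expect.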

It would be of interest to relate the solution to (4) and (6) to the corresponding solutions, when they exist, with $W = \infty$. If $\rho < 1$ there does not seem to be any simple relationship, but if $\rho > 1$ we have the following theorem.


THEOREM 2. If $\rho = \{\mu a\}^{-1} > 1$, and if $G(x)$, $G(W, x)$ are the solutions to (6) in the cases $W = \infty$, $W < \infty$, respectively, when $V(x)$ is given by (7), then

$$G(W, x) = L(W)\,G(x) \quad (0 \le x < W), \qquad (32)$$

where

$$L(W) = \left\{\mu e^{\mu W}\int_W^\infty G(y)\,e^{-\mu y}\,dy\right\}^{-1} \qquad (33)$$

and $G(x)$ has the Laplace-Stieltjes transform (31).

Proof. Substituting (32) into (6) we obtain

$$L(W)\,G(x) = V(x - W) + L(W)\int_{x-W}^{x} G(x - y)\,dV(y) \quad (0 \le x < W).$$

But

$$G(x) = \int_{-\infty}^{x-W} G(x - y)\,dV(y) + \int_{x-W}^{x} G(x - y)\,dV(y) \quad (x \ge 0).$$

Hence

$$V(x - W) = L(W)\int_{-\infty}^{x-W} G(x - y)\,dV(y) \quad (0 \le x < W).$$

Substituting from (7) we obtain (33).
4. THE LIMITING DISTRIBUTION OF QUEUE SIZE

In this section we consider the limiting distribution of queue size at the instants a service commences. It is not difficult to write down formulae for the distribution of queue size at other instants of time, for example, at an arrival or a departure (impatiently or otherwise), but these formulae are very complicated and do not seem to lead to simple expressions even in the case of a Poisson input process.

Let $P_n$ be the limiting probability that $n$ customers are left in the queue at the instant a service commences. $P_n$ is the conditional probability of $n$ arrivals in the waiting time of the customer starting service given that this waiting time does not exceed $W$. Thus

$$P_0 = F_0(W, 0+) + \int_{0+}^{W} \{1 - A(y)\}\,dF_0(W, y),$$

$$P_n = \int_{0+}^{W} \{A_n(y) - A_{n+1}(y)\}\,dF_0(W, y) \quad (n \ge 1),$$

where

$$F_0(W, x) = \frac{F(W, x)}{F(W, W - 0)} \quad (0 \le x < W)$$

and $F(W, x)$ is the solution to (4) and $A_n(x)$ is the $n$-fold iterated convolution of $A(x)$ with itself.


In the case the input process is Poisson with parameter $\lambda$ we obtain the following simple expressions

$$P_0 = (1 - \rho)(1 + \rho\mu T_0)\{1 - \rho\,e^{(\lambda-\mu)W}\}^{-1},$$

$$P_n = (1 - \rho)\,\rho\mu T_n\,\{1 - \rho\,e^{(\lambda-\mu)W}\}^{-1} \quad (n \ge 1),$$

where $\rho = \lambda/\mu$ and

$$T_n = \int_0^W e^{-\mu y}\,\frac{(\lambda y)^n}{n!}\,dy.$$
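The Poisson-input queue-size probabilities can be tabulated with a short recursion. The sketch below is an illustration under stated assumptions, not the paper's own computation: it takes $T_n = \int_0^W e^{-\mu y}(\lambda y)^n/n!\,dy$, so that integration by parts gives $T_n = \rho T_{n-1} - e^{-\mu W}(\lambda W)^n/(\mu\,n!)$ with $T_0 = (1-e^{-\mu W})/\mu$, and it takes $P_0 = (1-\rho)(1+\lambda T_0)/D$ and $P_n = (1-\rho)\lambda T_n/D$ with $D = 1-\rho e^{(\lambda-\mu)W}$. The function name is hypothetical and $\lambda \ne \mu$ is required.

```python
import math

def queue_size_probs(lam, mu, W, n_max):
    """Return [P_0, ..., P_{n_max}] for Poisson input (rate lam), service rate mu, lam != mu."""
    if math.isclose(lam, mu):
        raise ValueError("this sketch assumes lam != mu")
    rho = lam / mu
    norm = 1.0 - rho * math.exp((lam - mu) * W)
    T = (1.0 - math.exp(-mu * W)) / mu           # T_0
    probs = [(1.0 - rho) * (1.0 + lam * T) / norm]
    term = math.exp(-mu * W) / mu                # exp(-mu W) (lam W)^n / (mu n!) at n = 0
    for n in range(1, n_max + 1):
        term *= lam * W / n
        T = rho * T - term                       # T_n from T_{n-1}
        probs.append((1.0 - rho) * lam * T / norm)
    return probs

# Hypothetical parameter values.
probs = queue_size_probs(lam=1.0, mu=2.0, W=3.0, n_max=80)
total = sum(probs)
```

Under these assumptions the probabilities sum to one, since $\sum_n T_n = \{1-e^{-(\mu-\lambda)W}\}/(\mu-\lambda)$, which provides a check on the reading of the formulae.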


5. THE MANY SERVER QUEUE 


In this section we discuss in a heuristic way how the results of the previous sections are modified if there are $N$ independent and identical servers ($N > 1$).

Let $m_n$ denote the random variable which specifies the number of customers in the system on the $n$th arrival. If $m_n < N$ then the $n$th arrival does not wait, and if $m_n < N - 1$ then neither the $n$th nor the $(n+1)$th waits for service. If we suppose that $m_n > N - 1$ then it can be shown that equation (2) remains valid if the random variable $s_n$ has an exponential distribution with parameter $\mu N$ instead of $\mu$. The meaning of $w_n$, $w_{n+1}$ in this equation is modified, however, for they are now waiting times conditional upon $m_n > N - 1$. To emphasize this we rewrite the equation in the following way

$$(w_{n+1} \mid m_n > N-1) = \begin{cases} 0 & \text{if } (w_n \mid m_n > N-1) + u_n \le 0,\\ (w_n \mid m_n > N-1) + u_n & \text{if } 0 < (w_n \mid m_n > N-1) + u_n < W,\\ W & \text{if } W \le (w_n \mid m_n > N-1) + u_n. \end{cases} \qquad (34)$$

In the ordinary notation of conditional probabilities we have

$$P(w_{n+1} \le x \mid m_n > N-1) = P(w_n + u_n \le x \mid m_n > N-1) \quad (0 \le x < W).$$

Let us assume that the limiting distributions for $w_n$ and $m_n$ exist; we prove the following theorem.
THEOREM 3. The limiting distribution $\lim_{n\to\infty} P(w_n \le x \mid m_n > N-1)$ is the solution to (4) with $U(x)$ given by (1) and $\mu$ replaced by $N\mu$.

In order to prove the theorem we require the following lemma.

LEMMA 1.

$$\lim_{n\to\infty} P(w_{n+1} \le x \mid m_{n+1} > N-1) = \lim_{n\to\infty} P(w_{n+1} \le x \mid m_n > N-1).$$

We prove the lemma by noting that for $x \ge 0$

$$P(w_{n+1} \le x) = P(m_{n+1} \le N-1) + P(m_{n+1} > N-1)\,P(w_{n+1} \le x \mid m_{n+1} > N-1)$$

and

$$P(w_{n+1} \le x) = P(m_n \le N-1) + P(m_n > N-1)\,P(w_{n+1} \le x \mid m_n > N-1),$$

since $w_{n+1} = 0$ whenever $m_{n+1}$ or $m_n$ does not exceed $(N - 1)$. Assuming the existence of limiting distributions for $w_{n+1}$ and $m_n$ the lemma follows at once from these two equations.

It follows from the lemma and equations (34) that the limiting distribution

$$\lim_{n\to\infty} P(w_n \le x \mid m_n > N-1)$$

satisfies the integral equation (4) with $U(x)$ given by (1) and $\mu$ replaced by $N\mu$. This proves the theorem. Finally, we remark that if $F_N(W, x)$ denotes this limiting distribution then the limiting conditional distribution for waiting time given that the customer does wait is just $[F_N(W, x) - F_N(W, 0+)]/[1 - F_N(W, 0+)]$ in virtue of the equation

$$P(w_n \le x \mid w_n > 0, m_n > N-1)\,P(w_n > 0 \mid m_n > N-1) + P(w_n = 0 \mid m_n > N-1) = P(w_n \le x \mid m_n > N-1).$$


I am grateful to Dr F. G. Foster for reading an earlier version of this paper and suggesting 
some substantial improvements in presentation. 
The polykays of the natural numbers


By D. E. BARTON, F. N. DAVID AND EVELYN FIX*
University College London and University of California, Berkeley 


It was pointed out by David (1956) that problems concerning the null hypothesis dis- 
tribution of rank criteria for the random sequence can be simplified if these problems are 
translated into those of samples drawn without replacement from a finite population. Thus, 
suppose a random sequence of N elements, n of which possess a given characteristic, and
let the elements be numbered 1, 2, ..., N. Any criterion based on the ranks of the n elements
can be treated as a criterion based on n elements randomly drawn without replacement 
from the finite population of the first N natural numbers. This idea leads to a unification 
of the study of such criteria, and moments of the sampling distribution of any criterion based 
on symmetric functions of the n ranks may be written down with comparative ease with the 
aid of theory which has already been set out in general fashion by Wishart (1952). Further, 
the expressions given by Wishart can, for the case of the finite population of the first N 
integers, be considerably shortened by expressing them in terms of the generalized Bernoulli 
numbers. We give here formulae whereby this can be done. 


NOTATION 


N. E. Nörlund (1920, section 32) defined the Bernoulli polynomial of $\nu$th degree and $n$th order and showed it was the coefficient of $t^\nu/\nu!$ in the development of

$$\frac{t^n e^{xt}}{(e^t - 1)^n} = \sum_{\nu=0}^{\infty} \frac{t^\nu}{\nu!}\,B_\nu^{(n)}(x).$$

The Bernoulli polynomials of the first order, i.e. when $n = 1$, are those usually referred to as the Bernoulli polynomials. When $n = 1$ and $x = 0$ the numbers $B_\nu^{(1)}$ are those called the Bernoulli numbers by some authors. Others place these numbers in modulus, while others still omit the zero elements. We shall use Nörlund's notation $B_\nu^{(n)}(x)$, writing $B_\nu^{(n)}(0)$ as $B_\nu^{(n)}$. If $B_\nu^{(n)}$ is multiplied by a function on the right we shall incorporate this in square braces to distinguish it from a polynomial argument.

GENERATING FUNCTIONS FOR THE MOMENTS AND CUMULANTS 
OF THE FIRST N NATURAL NUMBERS 

The probability generating function, assuming all N numbers to have equal probability, is

$$P(t) = \frac{t(1 - t^N)}{N(1 - t)}.$$

From this we may deduce the central moment generating function, the cumulant generating function, and the factorial moment and factorial cumulant generating functions. These are

$$M(t) = \frac{\sinh \tfrac{1}{2}Nt}{N \sinh \tfrac{1}{2}t}, \qquad H(t) = \log M(t),$$

$$F(t) = \frac{(1+t)\{(1+t)^N - 1\}}{Nt}, \qquad G(t) = \log(1+t) + \log\{(1+t)^N - 1\} - \log(Nt),$$

* With the partial support of the Office of Naval Research and of the National Institutes of Health. 
and
2 N+1 Bo 
-" (1) ae 
~~ mee 2 ) (reven), x, =(—1—[Nr—1]  (r > 2), 
Mn = (7) (N -— 1)¢-9, Kip = . [ —1yr!+ > 0, tie 1) Bp, BPN] 
j=0 


These results are well known. The factorial cumulants do not appear to be capable of 
simpler expression. 


THE POLYKAYS OF THE FIRST N NATURAL NUMBERS 


The polykays $K_{rst\ldots}$ are functions of the moments (or cumulants) of the finite population composed of N elements. The same function, $k_{rst\ldots}$, of the sample of n averaged over all possible samples will be $K_{rst\ldots}$, i.e.

$$E_N(k_{rst\ldots}) = K_{rst\ldots}.$$

For example, if $\mu_r$ and $\kappa_r$ are the rth moment and cumulant quoted in the preceding section then

$$K_2 = \frac{N}{N-1}\,\mu_2 = \frac{N(N+1)}{12},$$

or again

$$K_4 = \frac{N^2}{(N-1)(N-2)(N-3)}\,[(N+1)\mu_4 - 3(N-1)\mu_2^2] = -\frac{N^2(N+1)^2}{120},$$

and so on. The fact that the K's are calculated instead of the moments or cumulants is solely for ease of manipulation.


SINGLE SUFFIX POLYKAYS 


From Aty's (1954) general result (a proof of which we have given elsewhere), for the rth central moment, $M(1^r)$, of the mean of a sample of n from a general finite population of N we see that $n^{r-1}M(1^r)$ is a polynomial in n, N and the polykays of the finite population. More precisely, considering this purely as a polynomial in n with constant coefficients we have that $n^{r-1}M(1^r) = K_r +$ terms in $n^1$ and higher powers. Further, from the general relationship for cumulants in terms of moments we have that

$$\kappa(1^r) - M(1^r)$$

is a finite sum of products of at least two of the $\{M(1^s)\}$, the sum of the weights in each product being equal to r. Hence it is true generally that

$$K_r = \lim_{n\to 0} n^{r-1}\kappa(1^r).$$

Since we know explicitly the rth cumulant of a sum, w, of n elements chosen randomly from the finite population of N natural numbers we can find $K_r$ for this population. The Eulerian identity, proved by Cauchy (1843), states in statistical terms that the p.g.f. of w is

$$\Pi(z) = \frac{z^{\frac{1}{2}n(n+1)}\displaystyle\prod_{i=1}^{N}(1 - z^i)}{\dbinom{N}{n}\displaystyle\prod_{i=1}^{n}(1 - z^i)\prod_{i=1}^{N-n}(1 - z^i)}.$$
Accordingly, since the rth cumulant of w is

$$\kappa_r = 0 \quad (r \ge 3,\ r \text{ odd}),$$

$$\kappa_r = \frac{B_r^{(1)}}{r(r+1)}\,\{B_{r+1}^{(1)}(N+1) - B_{r+1}^{(1)}(N-n+1) - B_{r+1}^{(1)}(n+1) + B_{r+1}^{(1)}(1)\} \quad (r \text{ even}),$$

for even r

$$\lim_{n\to 0} n^{-1}\kappa_r = \frac{B_r^{(1)}}{r}\,\{B_r^{(1)}(N+1) - B_r^{(1)}(1)\},$$

i.e.

$$K_r = \frac{B_r^{(1)}}{r}\,\{B_r^{(1)}(N+1) - B_r^{(1)}\};$$

for odd r, $K_r = 0$ ($r \ge 3$); $K_1 = \tfrac{1}{2}(N+1)$.
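The single suffix result is easy to check in exact arithmetic. The sketch below is illustrative only: it assumes the reading $K_r = (B_r/r)\{B_r(N+1) - B_r\}$ for even $r$, with ordinary first-order Bernoulli numbers and polynomials ($B_1 = -\tfrac12$), and compares it with the usual $k$-statistic formula for $k_4$ applied to the whole population. All function names are hypothetical.

```python
from fractions import Fraction
from math import comb

def bernoulli_numbers(m):
    """Return [B_0, ..., B_m] with the convention B_1 = -1/2."""
    B = [Fraction(1)]
    for n in range(1, m + 1):
        B.append(-sum(comb(n + 1, k) * B[k] for k in range(n)) / (n + 1))
    return B

def bernoulli_poly(r, x, B):
    """First-order Bernoulli polynomial B_r(x) = sum_k C(r, k) B_k x^(r-k)."""
    return sum(comb(r, k) * B[k] * Fraction(x) ** (r - k) for k in range(r + 1))

def single_suffix_polykay(r, N):
    """K_r for the population 1, 2, ..., N under the assumed formula."""
    if r == 1:
        return Fraction(N + 1, 2)
    if r % 2 == 1:
        return Fraction(0)
    B = bernoulli_numbers(r)
    return B[r] / r * (bernoulli_poly(r, N + 1, B) - B[r])

def population_k4(N):
    """k_4 computed from the central moments of the whole population (needs N >= 4)."""
    xs = range(1, N + 1)
    mean = Fraction(sum(xs), N)
    m2 = sum((x - mean) ** 2 for x in xs) / N
    m4 = sum((x - mean) ** 4 for x in xs) / N
    return N * N * ((N + 1) * m4 - 3 * (N - 1) * m2 ** 2) / ((N - 1) * (N - 2) * (N - 3))
```

For instance, `single_suffix_polykay(2, N)` can be compared with $N(N+1)/12$ and `single_suffix_polykay(4, N)` with `population_k4(N)`.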


UNIT SUFFIX POLYKAYS 
The polykays with suffices which have unity only as a part have a similar simple expression. It is known that

$$K_{1^r} = \frac{r!\,U_r}{N^{(r)}},$$

where $U_r$ is the unitary symmetric function of weight r. We note that $U_r$ is equal to

$$U_r = (-1)^r\,\frac{N^{(r)}}{r!}\,B_r^{(N+1)},$$

and consequently

$$K_{1^r} = (-1)^r B_r^{(N+1)}.$$
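The unit suffix polykays can also be obtained by brute force for small N, which gives a check that does not depend on the generalized Bernoulli numbers. The sketch below assumes that $K_{1^r}$ equals $r!\,U_r/N^{(r)}$, with $U_r$ the sum of all products of $r$ distinct population members and $N^{(r)} = N(N-1)\cdots(N-r+1)$; the function name is illustrative.

```python
from fractions import Fraction
from itertools import combinations
from math import factorial, prod

def unit_polykay(r, N):
    """K_{1^r} for the population 1, ..., N by direct enumeration (small N only)."""
    # U_r: elementary symmetric function of degree r in 1, 2, ..., N.
    U_r = sum(prod(c) for c in combinations(range(1, N + 1), r))
    # Falling factorial N (N - 1) ... (N - r + 1).
    falling = prod(range(N - r + 1, N + 1))
    return Fraction(factorial(r) * U_r, falling)
```

For $r = 1$ this returns the population mean $\tfrac12(N+1)$, and for $r = 2$ it agrees with $K_1^2 - K_2/N = (N+1)(3N+2)/12$.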


RELATIONS BETWEEN THE POLYKAYS FOR ANY FINITE POPULATION 


The polykays with more than one suffix and with one or more of these suffices greater 
than unity do not seem generally to be expressible simply in terms of the Bernoulli poly- 
nomials of any order. Aty (unpublished thesis) by inverting Wishart’s equations obtained 
formulae for the general finite population in which the polykays are expressed as functions 
of the single suffix kays. Either using his results or by going back to Wishart’s original 
formulae the manipulation, although elementary, is cumbersome. We looked therefore at 
a further possibility, that of expressing polykays in terms of others of the same weight. This 
has advantages since, for example, the polykays with suffices containing unity only as a 
part are usually easy to calculate for any finite population. Some of our results are set out 
below. They are quite general and may be used for any population. 











oto) ee oe 18 r+2i\(r+i)! (r+s)! 
Boy = Gao 2 yo) a eeenqe tre 
K, =*0,{KE+23S 1" x K,, 
i ‘8 18 F.8 stiC, 8+t a 
= r—)) 7 N +2r— 
Ky K. = wy Kur _ N(N =) 32 27- st+— Fa Ege, 
N 
Kggs-1 yr = 5 (Kone K Kor 147- —Kyys), 
. N-1 ((N+2r- 1 
Keys a 5\( vay oe Ky Ka) —n Kygras— KyrsK) 
KyK, = 7 Ky ae a) Kew- + tes 
i«t (ee weet 5(K wut iS" Keys vv-sK) |, 
54253 A | Ke K- a So Kral + Ke lt oak 
K.K, hoe 
Kse 2K Ke 


Ky = K, Ki - 








N?2 a 
2K, r+1,1 2 (4) 


K, Ky, = be N Ne - xrC —t+1, int Kay (r odd), 





2Kua1 2 tr ; 


K, Ky = N  N® Py Ci Kyi, is — xi "Cy By, te + Kya (r even). 





From the known expressions for the generalized Bernoulli polynomials (see, for example, 
Davis, 1935), the formulae of the previous sections may be used to obtain the polykays 
descriptive of the finite population of the first NV natural numbers. These up to and including 
the eighth-order polykays are given in Table 1. 


Table 1. The polykays of the first N natural numbers 


Coefficients of 











Dividing a wa — 
factor N®& N? Né N5 N1 N3 N? N N® 
2 — — —— — _ — —_ 1 1 
12 — — — — — — 1 1 ~- 
12 oe oi ~— — — 3 5 2 
— me ae on _ a 0 0 0 0 
24 — — -- — — 1 2 1 ~ 
8 can on bi er sie 1 2 1 Desi 
— 120 —- -— — --- 1 2 1 = — 
120 — — -— -- — 1 2 1 — 
720 _- a - —- 5 6 —5 —6 - 
720 _-- — -= —_ 15 40 33 8 - 
240 wise - a = 15 30 5 —18 =e 
_- — -—- -- 0 0 0 0 0 0 
— 240 — — - ] 3 3 1 — — 
— — — — 0 0 0 0 0 0 
120 — — — 1 3 3 I ~ 
1,440 oo 15 30 5 —18 —8 —_— 
480 — — 5 15 13 ] -—2 
96 —- —- 3 5 —5 -—13 —6 — 
504 — — 2 6 5 0 —1 — - 
— 504 -— —= -- 2 6 5 0 -1 — 
— 10,080 _- _- 7 5 —27 —25 20 20 - 
— 10,080 -— ~- 21 77 81 -9 — 54 —20 _— 
— 5,040 ao — — 6 15 4 —15 —10 — 
10,080 — — ~ 7 17 3 —17 —10 - 
60,480 ~— - 35 21 — 157 —141 122 120 — 
10,080 — - — 63 231 299 153 22 - 
60,480 — -— 105 301 147 — 257 — 252 —44 _ 
20,160 _ — 105 315 161 —275 — 322 — 96 —- 


4,032 — . 63 63 —315 — 539 —74 236 96 
Table 1 (cont.)
Coefficients of 
Poly- Dividing r A nee 
kay factor Ns Nn’ Ne Ns Ns Ns N? ™ Ne 
K, — — 0 0 0 0 0 0 0 0 
Ka 1,008 — 2 8 il 5 ~i —J yee aa 
Ks — — 0 0 0 0 0 0 0 0 
i — — 0 0 0 0 0 0 0 0 
Ks: —504 . — — 2 8 il 5 ~1 -1 om 
Mine —20,100 — 7 12 —22 ~52 —5 40 20 ~ 
Kye — = 0 0 0 0 0 0 0 0 
Ky, 10,080 — an 6 21 19 —i —25 —10 es 
Kys 69 3 — 7 28 18 ~68 ~133 ~88 —20 - 
Kyo 10,080 — sat 7 24 20 —14 —27 —10 qe 
Ks, 120,960 — 35 56 — 136 — 298 ~19 242 120 — 
Ky: 5,040 — = 21 84 110 32 — 35 —20 on 
Kay 40,320 — 35 112 56 —138 —131 26 40 : 
Kys 8,064  — 21 56 —28 —210 ~201 —38 16 as 
Ky ar = 9 0 —84 —98 91 194 80 
K, —720 3 12 14 0 9 0 2 st es 
Ky ~ oe 3 12 14 0 a 0 2 i 
Kes 30,240 10 4 = —53 145 133 —84 ~84 . 
Ks, 5,040 — 3 7 ~% — 28 —10 21 41 oe 
Ky 100,800 7 —20 ~62 180 503 120 —448 — 280 
Ky 30,240 30 140 155 ~ 183 —475 ~175 146 84 i 
Ksa1 30,240 — 10 22 —29 ~95 —23 73 42 on 
i. “mae — 7 40 78 40 —57 —80 —28 _ 
Ky —604,800 35 —48 ~380 42 1,225 678 — 880 —672 en 
Ky,  —302,400 — 30 63 —103 —325 —95 262 168 ~ 
Ky» —10,080  — 30 63 — 103 —325 —95 262 168 i 
Ky ~~ — 604,800 105 250 —344 1,410 — 785 960 1,024 200 
Ky 302,400 90 333 297 — 245 —455 —88 68 ae 
Kyo 604,800  — 35 7 —128 — 370 —75 298 168 oe 
Ky 3,628,800 175 -140 —1,754 -176 —5,335 3,340 —3,756 3,024 - 
Kyqs 604,800 — 35 1,290 1,778 —380 —1,765 ~910 —48 ~~ 
Kys  — 201,600 105 420 — 160 2,256 6,145 4,830 1,480 64 
Ky 3,628,800 525 1,190 —1,690 -6,052 —2,275 4,670 3,440 192 
Ky,s 120,960 — 315 1,260 1,200 —1,418 3,419 —2,146 — 400 
Kw 241,920 105 336 14 — 940 —795 444 676 160 
Koy 241,920 315 630 1,890 -6,160  —3,865 3,226 4,288 1,152 _ 
K, 34,560 135 —180  —1,890 —840 6,055 8,140 884 3,088  —1,152 


THE POLYKAYS OF THE HYPERGEOMETRIC DISTRIBUTION 
Suppose a finite population of $\beta$ black balls and $N - \beta$ white balls all indistinguishable except as regards colour. We may regard this as a finite population of $\beta$ 1's and $(N - \beta)$ 0's for sampling purposes. Since the result $K_r = \lim_{n\to 0} n^{r-1}M(1^r)$ is quite general, application


in this context yields 


and in particular 


K, = 


Ky 


K, = 


r 


N’ in 


N® 


a 


n—>0 


j (j) 
2, Mor _, Bo 


AGS 


N-f) , _AB(N-£)(N-2f) 


a. N® 


— PNP) nw +1) 68(N — A). 


b 
The multiple suffix polykay of unit parts is

$$K_{1^r} = \frac{\beta^{(r)}}{N^{(r)}}.$$
From these quantities all the intervening polykays may be deduced from our formulae 
and in particular pon — py® 
Kx =" N® 


Using Aty’s result for the mean of a sample it is possible to write down the moments of the 
distribution of the number, a, of black balls in a sample of m drawn from N without 
replacement. 


MOOD'S CRITERION


For the sake of example and to illustrate the use of the formulae which we have provided 
we consider the moments of a criterion proposed by Mood (1954). Given a random sequence 
of N elements, n of one kind and N —n of another, which are ranked in order of magnitude, 
Mood proposed as a possible test for difference of dispersion the average of the sum of the 
squares of the deviations of the ranks of the elements of one kind from $\tfrac{1}{2}(N + 1)$. In general terms we may write this as

$$\theta = \frac{1}{n}\sum_{i=1}^{n}(x_i - K_1)^2 = \frac{1}{n}\sum_{i=1}^{n}(x_i - k_1 + k_1 - K_1)^2,$$

where $k_1$ is the sample mean and $K_1$ the population mean of the N elements. Hence for any finite population, exclusive of any idea of ranking,

$$\theta = \frac{n-1}{n}\,k_2 + (k_1 - K_1)^2,$$

or

$$\theta - K_1^2 = \frac{n-1}{n}\,k_2 + k_1^2 - 2k_1K_1 = \Delta \ \text{(say)}.$$

The moments of $\Delta$ are straightforward. Thus

$$E(\Delta) = K_2 + K_{11} - 2K_1^2,$$


&(A—6(A))? = (; “ x) [K(Ay-) = Kal“ | 


1 2 
&(A—6&(A))> = (5) (; -¥) 
N-2 N-2)2 10N2—48N +6 N- 
«|K(Py) + 12K yo ( Wi y + Ky + 64) + 8K,s! pa 








with a longer expression for the fourth moment which we do not reproduce here. For the particular case considered by Mood, the finite population is the first N integers, $K_1 = \tfrac{1}{2}(N + 1)$ and the polykays are those which we have tabulated. On substitution we have

$$E\{\theta - E(\theta)\}^2 = \frac{N(N+1)(N^2-4)}{180}\left(\frac{1}{n} - \frac{1}{N}\right),$$

$$E\{\theta - E(\theta)\}^3 = \frac{N^2(N+1)(N+2)(N^2-16)}{3780}\left(\frac{1}{n} - \frac{1}{N}\right)\left(\frac{1}{n} - \frac{2}{N}\right),$$
N(W +1)(N +2) eugice Hom 
nt PPh allt! eles 4 N3—4N2420N ule AR ili is 
&(0—E(6)) — [2 +1)(A 4N24.20N +88)(% awtom = 
+(N —1)(7N*—20N3— 4N2— 64N — 528) (, é x) | 
n N) | 


ALTERNATIVE APPROACH 
An alternative method of approach which is suitable for some rank criteria and which is 
applicable to finding the moments of Mood’s statistic is to make use of Aty’s general results. 
Aty (1954) showed for r < 12 that for any finite population of N elements which has population polykays $K_{rst\ldots}$ the moments of the mean $k_1$ of a sample of n,

$$M(1^r) = E_N(k_1 - K_1)^r,$$

may be written as

$$M(1^2) = K_2A_2, \qquad M(1^3) = K_3A_3, \qquad M(1^4) = K_4A_4 + 3K_{22}A_2^2$$
ar-2  gr3 


P = ar-l1_ 
and so on, where A, = @ Wt ye 





and a=--— 


Mood's statistic can be regarded as a mean of n elements each randomly drawn without
replacement from a population

$$(-r)^2,\ (-r+1)^2,\ \ldots,\ (-1)^2,\ 0,\ (1)^2,\ \ldots,\ (r)^2 \qquad (N = 2r+1),$$

or

$$(-r+\tfrac12)^2,\ (-r+\tfrac32)^2,\ \ldots,\ (-\tfrac12)^2,\ (\tfrac12)^2,\ \ldots,\ (r-\tfrac12)^2 \qquad (N = 2r).$$

The polykays of these two populations are the same functions of N. They are

$$K_1 = \frac{N^2-1}{12}, \qquad K_2 = \frac{N(N+1)(N^2-4)}{180},$$

$$K_3 = \frac{N^2(N+1)(N+2)(N^2-16)}{3780}, \qquad K_4 = -\frac{N^2(N+1)^2(N+2)(N^3-4N^2+20N+88)}{37800},$$
_ 

‘oe a 1) (7N4_ 90N3— 4N2—64N —528). 


Consequently from Aty's results the moments of Mood's statistic follow. It will be noticed
that if $n = \lambda N$, where $\lambda$ is constant, the limiting value of the moment ratio $\beta_2$ is 3.
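
The low-order results for Mood's statistic can be checked by brute force. The following Python sketch is not part of the original paper: the function names are hypothetical, and it assumes the readings $K_1 = (N^2-1)/12$, $K_2 = N(N+1)(N^2-4)/180$, $K_3 = N^2(N+1)(N+2)(N^2-16)/3780$, $A_2 = 1/n - 1/N$ and $A_3 = (1/n-1/N)(1/n-2/N)$. It enumerates every sample of n ranks out of N and compares the exact mean, variance and third central moment of $\theta$ with $K_1$, $K_2A_2$ and $K_3A_3$, using exact rational arithmetic.

```python
from fractions import Fraction
from itertools import combinations


def squared_rank_deviations(N):
    """Population (r - (N+1)/2)^2 for the ranks r = 1, ..., N."""
    centre = Fraction(N + 1, 2)
    return [(r - centre) ** 2 for r in range(1, N + 1)]


def first_three_polykays(pop):
    """K1, K2, K3 of a finite population (ordinary k-statistic formulae)."""
    N = len(pop)
    K1 = sum(pop) / N
    s2 = sum((x - K1) ** 2 for x in pop)
    s3 = sum((x - K1) ** 3 for x in pop)
    return K1, s2 / (N - 1), N * s3 / ((N - 1) * (N - 2))


def enumerated_moments(N, n):
    """Mean, 2nd and 3rd central moments of theta over all C(N, n) samples."""
    pop = squared_rank_deviations(N)
    thetas = [sum(sample) / n for sample in combinations(pop, n)]
    mean = sum(thetas) / len(thetas)
    m2 = sum((t - mean) ** 2 for t in thetas) / len(thetas)
    m3 = sum((t - mean) ** 3 for t in thetas) / len(thetas)
    return mean, m2, m3


def formula_moments(N, n):
    """K1, K2*A2 and K3*A3 from the closed forms assumed in the lead-in."""
    K1 = Fraction(N * N - 1, 12)
    K2 = Fraction(N * (N + 1) * (N * N - 4), 180)
    K3 = Fraction(N * N * (N + 1) * (N + 2) * (N * N - 16), 3780)
    A2 = Fraction(1, n) - Fraction(1, N)
    A3 = A2 * (Fraction(1, n) - Fraction(2, N))
    return K1, K2 * A2, K3 * A3
```

Exhaustive enumeration is only practical for small N, but it gives an exact check that is independent of the polykay algebra.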
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The distribution of regression coefficients in samples from 
bivariate non-normal populations 


I. Theoretical investigation 


By A. B. L. SRIVASTAVA 
Statistical Laboratory, Indian Institute of Technology, Kharagpur 


1. INTRODUCTION


The work done so far on the sampling distributions of statistics arising from bivariate
non-normal populations mainly relates to the study of the effect of non-normality on r,
the correlation coefficient. Pearson (1929) showed that considerable non-normality can
be allowed in the parent bivariate population without seriously affecting the distribution
of r; Quensel (1938) derived the density function of r in samples from a bivariate
Gram-Charlier Type A population under the assumption that the population correlation $\rho$ is
small, and Gayen (1951) studied the distribution of r in samples from populations represented
by the Edgeworth surface for the case when $\rho$ is not necessarily small. So far as the
regression coefficient $b_{21} = m_{11}/m_{20}$ is concerned, only expressions for its first four cumulants
have been obtained in the non-normal case. Quensel (1938) obtained the expressions for
the mean and the variance of $b_{21}$ from the joint characteristic function of the second-order
moments and Cook (1951) derived the first four moments of $b_{21}$, using some generalizations
of Kendall's k-statistics. It appears that no attempt has been made to study the frequency
distribution of $b_{21}$ when the parent population is non-normal, except for the work of
Hyrenius (1952) who has obtained the distribution of $b_{21}$ by considering the parent
population to be specified by a bivariate compound normal distribution. Hill (1954) has derived
cumulants of the function $m_{11} - cm_{20}$ using the k-statistics in the non-normal case, and has
made use of them to calculate approximate percentage points of $b_{21}$.

Assuming the parent population to be specified by the Edgeworth surface (including
terms up to those in $\lambda_{30}^2, \lambda_{30}\lambda_{21}, \ldots, \lambda_{03}^2$), the distribution of $b_{21}$ has been derived here. It
holds for the values of $\rho$ not necessarily small, as well as (i) for any size of sample provided
that fifth- and higher-order population cumulants are negligible, and (ii) for any population
if samples are sufficiently large. The distribution of the t-statistic used for testing the
significance of the regression coefficient has also been derived, and of the two cases of known and
unknown marginal variances, the latter has been considered in detail because of its obvious
practical importance. For this, in addition to the normal theory tail probability of t,
the corrective terms in $\lambda_{40}, \lambda_{31}, \ldots, \lambda_{04}$ and $\lambda_{30}^2, \lambda_{30}\lambda_{21}, \ldots, \lambda_{03}^2$ have been obtained explicitly.


2. FREQUENCY FUNCTION OF THE REGRESSION COEFFICIENT 
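Let us consider the bivariate parent population to be represented by the Edgeworth
surface

$$f(x,y) = \left[1 + \sum \frac{(-1)^{i+j}}{i!\,j!}\,A_{ij}\,\frac{\partial^{i+j}}{\partial x^i\,\partial y^j}\right]\phi(x,y), \tag{1}$$

where

$$\phi(x,y) = \frac{1}{2\pi\sqrt{(\kappa_{20}\kappa_{02}-\kappa_{11}^2)}}\exp\left[-\frac{\kappa_{02}x^2 - 2\kappa_{11}xy + \kappa_{20}y^2}{2(\kappa_{20}\kappa_{02}-\kappa_{11}^2)}\right],$$

$$A_{30} = \kappa_{30}, \quad A_{21} = \kappa_{21}, \quad A_{12} = \kappa_{12},$$

$$A_{40} = \kappa_{40}, \quad A_{31} = \kappa_{31}, \quad A_{22} = \kappa_{22}, \tag{3}$$

$$A_{60} = 10\kappa_{30}^2, \quad A_{51} = 10\kappa_{30}\kappa_{21}, \quad A_{42} = 4\kappa_{30}\kappa_{12} + 6\kappa_{21}^2, \quad A_{33} = \kappa_{30}\kappa_{03} + 9\kappa_{21}\kappa_{12},$$

and the $\kappa_{ij}$'s denote the semi-invariants of the population. We shall use the standardized
semi-invariants $\lambda_{ij} = \kappa_{ij}/\sqrt{(\kappa_{20}^{i}\kappa_{02}^{j})}$ in the following derivations and denote the population
correlation coefficient $\kappa_{11}/\sqrt{(\kappa_{20}\kappa_{02})}$ by $\rho$.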

Let $(x_i, y_i)$ $(i = 1, 2, \ldots, N)$ be a random sample from the population represented by (1).
Using Gayen's (1951) formula (9) and taking $\lambda_{ij}$'s to be the standardized semi-invariants,
the joint characteristic function of

$$m_{20} = \frac{\Sigma(x_i-\bar{x})^2}{N}, \qquad m_{11} = \frac{\Sigma(x_i-\bar{x})(y_i-\bar{y})}{N}, \qquad m_{02} = \frac{\Sigma(y_i-\bar{y})^2}{N}, \tag{4}$$

where $\bar{x} = \Sigma x_i/N$ and $\bar{y} = \Sigma y_i/N$,
can be obtained by substituting in it the following values for his $D$, $\tau_{20}$, $\tau_{11}$ and $\tau_{02}$:
2 , 2 , 4x2, Koo Koo — K?}) 
D= (1 — Hy Xe0its) (: iy enilon) — WN * (ita) (toe) — a ty) — “a0 o2 oy ™ (itu)®, 


1 2 ; ‘ ‘ 
70 =D ND (P?K 29 ita + Ky, ity1 + Kog toe) — 1, 
: r= (5) 
p errr ; 
ana ae * + WD (2k g9 tt ag + (1 + p*) 4/(K29 Koa) ta + 2pK ye ttoa), 


1 
Tor = D~ ND = (Kaoitay + Kis itys + "Koa ttoa) — 1. 





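Gayen (1951, pp. 221-3) derived the joint distribution of $m_{20}$, $m_{11}$ and $m_{02}$ from the
characteristic function expressed by his formula (9) relating to the case when $\kappa_{20} = \kappa_{02} = 1$.
But in deriving the distribution of the regression coefficient $b_{21} = m_{11}/m_{20}$ it seems
desirable not to impose any restrictions on $\kappa_{20}$ and $\kappa_{02}$. However, when $\kappa_{20} \neq \kappa_{02}$, the joint
frequency function of $m_{20}$, $m_{11}$ and $m_{02}$ can be derived without difficulty from Gayen's
characteristic function (9) modified by our equations (5), and can be expressed by his formula
(21)* with the following modifications:

(i) the normal theory term $f_0(m_{20}, m_{11}, m_{02})$ occurring in it becomes

$$f_0(m_{20}, m_{11}, m_{02}) = \frac{N^{N-1}\,(m_{20}m_{02}-m_{11}^2)^{\frac12(N-4)}}{4\pi\,\Gamma(N-2)\,(\kappa_{20}\kappa_{02}-\kappa_{11}^2)^{\frac12(N-1)}}
\exp\left[-\tfrac12 N\left(\frac{\kappa_{02}m_{20} - 2\kappa_{11}m_{11} + \kappa_{20}m_{02}}{\kappa_{20}\kappa_{02}-\kappa_{11}^2}\right)\right]; \tag{6}$$

(ii) the quantities $M_{20}$, $M_{11}$ and $M_{02}$ are now given by

$$(\kappa_{20}\kappa_{02}-\kappa_{11}^2)\,M_{20} = \kappa_{02}m_{20} - 2\kappa_{11}m_{11} + \rho^2\kappa_{20}m_{02},$$

$$(\kappa_{20}\kappa_{02}-\kappa_{11}^2)\,M_{11} = \rho\kappa_{02}m_{20} - (1+\rho^2)\sqrt{(\kappa_{20}\kappa_{02})}\,m_{11} + \rho\kappa_{20}m_{02}, \tag{7}$$

$$(\kappa_{20}\kappa_{02}-\kappa_{11}^2)\,M_{02} = \rho^2\kappa_{02}m_{20} - 2\kappa_{11}m_{11} + \kappa_{20}m_{02};$$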


* In Gayen’s (1951, pp. 222-3) result (21), inside the square brackets which have — 6A ,A,, as the 
coefficient, Mz. should read as Mo. Incidentally, it may be pointed out that in his formula (39) for 
var (r) on p. 226, in the second line, the term $4(1-\rho^2)(2+11\rho^2)$ is a misprint for $4(1-\rho^2)^2(2+11\rho^2)$.





and (iii) the quantity $m_{20}m_{02} - m_{11}^2$ occurring in the formula (21) is replaced by

$$(m_{20}m_{02} - m_{11}^2)/(\kappa_{20}\kappa_{02}).$$

Let us denote his result (21) so modified by

$$g(m_{20}, m_{11}, m_{02}). \tag{8}$$

In (8), by substituting for $M_{20}$, $M_{11}$ and $M_{02}$ the expressions given by (7), a typical term can
be seen to be of the form

$$f_0(m_{20}, m_{11}, m_{02})\,m_{20}^{r_1}m_{11}^{r_2}m_{02}^{r_3}, \tag{9}$$

$r_1, r_2, r_3$ being positive integers such that $r = r_1 + r_2 + r_3 \le 3$.
Putting $m_{20}b_{21} = m_{11}$ and integrating out $m_{20}$ and $m_{02}$, we find the frequency function
of $b_{21}$ corresponding to (9) is given by


he rarq(O21) = fo(bex) (5) (snk —Ki,)" 


r T(3N —1+7r,—j) T(3N +714+1724+J) 
13(1, ht2+27 2 : —_——— re 10 
: Pl CPi T(E) PSN — 1) K3§-7(Ka9 03) — 2K 43 D9 + Kog) 7247 |’ ike 





N-3 " — x2,)kW-1) 
where fo(bo3) = 2 a pie ree) (K29Ko2 — Kit) os (11) 
mT(N — 2) x33’ (9963 — 2k 11 Bay + Koa) 
is the normal theory frequency function of $b_{21}$.
Using (10), we obtain from (8) after some simplification, the frequency function of $b_{21}$ as
fe N-1 N(N + 2) 
F(be1) = fo(be1) ! + arpa B 
X (Ago Bio — 4A31 Bip Boy + 6Age Big Boy — 4A13 Byp Boy + Ags Bor) 


2N 
= B, [{N (1 — p?) + 2p? + 1} (Ago Big — 2Ag1 By Box + Ave Boi) 


— 6p(Az1 Bio ae 2A By Box a Ai3Bi) Ss 3(Ag: Big — 2013 Bi Boy + Agu Bii)] 
+ [{N?(1 —p?)? + 2N(1 —p*) p® + 4p? — 1} Ago — 4p{N (1 — p*) + 2p? + VAs 


+ 2{N(1 — p?) + 8p? + 1} Ags — 12pA45 + Boul 
= ae +8 


* 12N(V+1)(N+3)0—p*| «+BY 

+ 3(2A 39 Ajo ~~ 3A3;) Bio Bo — 2(AgoAog + 9Aai Artz) Bio Boy 

+ 3(3Aj2 + 2Ag3A21) Big Bor — 6Ag3 A12 Bip Bor + Abs Boi] 

3N(N +2) 
Bi, 

+ 2(Ago Aye + 2A3,) Big Bor — 4A21A12 By Bor + Ain Bor} 

— 10p{Ago Agi Bio — 2(AgoAie + Adi) Bip Bor + (Azo Ags + 5Av1 Are) Bio Bor 

— (Aig + AggAo1) Bip Bor + AgsArz2 Boi} + 5{Adi Bio — 4A21A,2 Bin Bor 

+ 2(2AFs + Ags Aa) Bio Bo — 4A93A12 B19 Bo + Abs Bor} 

— 2(1 — p?)? Kge{(A5i — AgoAr2) Bio + (AsoAos — Aai Ara) Bro Bor + (Aja — Ags Aer) Bor} 


[Ago Boo — 6Az9A21 Bio Bos 


[{N (1 — p?) + 2p? + 3} {AZo Big — 4Ag9 Av: Bio Boy 


3N 
+R [{N(N— 2) (1 —p?)? + 6(N— 2) (1 — p*) + 15} (A§q Big — 2A go Aor Bro Bor + AS Bor) 
11 
— 12p{N (1 — p?) + 2p? + 3} {Ago Aa Big — (AgoAre + Adi) Bry Bor + Aoi Are Bir} 
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+ 30p*{Ag9Ay2 Big — (AgoAos + Avi Az) Byo Bor + Ags A21 Box} 
+ 6{N (1 — p®) + 7p? + 3} (Ad, Big — 2Ag1 Aye Bip Bor + Afi Box) 
— 60p {A112 Big — (Aja + Ags A21) Bio Bor + Ags Ar2 Bos} 
+ 15(Aj2 Big — 2Ao3 Ai2 By Boy + Abs Bou) 
— 2(1 —p?)? Koa{(N (1 — p?) + 8) (AS, —AgoAre) + 3P(Ag0Aos — Azar Are) + 3(Ai2 — Ags Aer) }) 
— [{V(N? — 4) (1 —p?)® + 3N(N — 2) (1 —p?)? + 9(N — 2) (1 —p?) + 15} AZo 
— 6p{N(N — 2) (1 —p?)? + 6(N — 2) (1 —p?) + 15} AgyAn 
+ 18p*{N (1 —p®) + 2p? + 3} AgoAre + 3{(W — 2) (NW — 6) (1 —p?)? 
+ 6(2N — 9) (1 —p*) + 45} A3, — 30p3Ag9 Ags — 18p{2N (1 — p?) + 9p? + 6} Agi Ayo 


i 
+ 9{N(1 — p?) + 12p? + 3} AZ, + 90P?Ags Ag, — 9OPAg3 Ayo + 1598} , (12) 
where

$$B_{10} = \rho\sqrt{\kappa_{20}}\,b_{21} - \sqrt{\kappa_{02}},$$

$$B_{01} = \sqrt{\kappa_{20}}\,b_{21} - \rho\sqrt{\kappa_{02}}, \tag{13}$$

$$B_{11} = \kappa_{20}b_{21}^2 - 2\kappa_{11}b_{21} + \kappa_{02}.$$

As a check, it may be shown that the integral of (12) over $(-\infty, \infty)$ is unity. The corrective
terms due to the various $\lambda$'s do not appear to be expressible as derivatives with respect to
$\beta_{21}$ $(= \kappa_{11}/\kappa_{20})$ or $\rho$, of the normal theory frequency function of $b_{21}$, and in view of this further
simplification of the frequency function of $b_{21}$ as in the case of r (Gayen, 1951) has not been
attempted.

3. MEAN AND VARIANCE OF THE REGRESSION COEFFICIENT 


We note that the typical term of (12) is of the form

$$\frac{(\rho\sqrt{\kappa_{20}}\,b_{21}-\sqrt{\kappa_{02}})^{r_1}\,(\sqrt{\kappa_{20}}\,b_{21}-\rho\sqrt{\kappa_{02}})^{r_2}}{(\kappa_{20}b_{21}^2 - 2\kappa_{11}b_{21} + \kappa_{02})^{s}}\,f_0(b_{21}), \tag{14}$$

$r_1, r_2$ taking the values 0, 1, 2, ..., 6 such that $r_1 + r_2 \le 6$ and s taking the values 0, 1, 2, 3.
It can be shown that the kth moment about zero of $b_{21}$ corresponding to (14) is given by


KAGE H+ y-29( 1 — p2)brytra-20) [ [uJ(1—p*) +p [pu—/(l- pw 5) 
xik B(4, 4(N — 1) 


Hi(Do1),, 8 i (1+ y2)tN+s 


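The evaluation of the integral in (15) is quite straightforward. With its help we can derive
the mean and the variance of $b_{21}$ directly from (12). They are found to be

$$E(b_{21}) = \left(\frac{\kappa_{02}}{\kappa_{20}}\right)^{\frac12}\left[\rho + \frac{N-1}{N(N+1)}(\rho\lambda_{40}-\lambda_{31}) - \frac{4(N-2)}{N(N+1)(N+3)}(\rho\lambda_{30}^2-\lambda_{30}\lambda_{21})\right] \tag{16}$$

and

$$V(b_{21}) = \frac{\kappa_{02}}{(N-3)\kappa_{20}}\Big[(1-\rho^2) + \frac{N-1}{N(N+1)}\{[(N-6)\rho^2+1]\lambda_{40} - (N-5)(2\rho\lambda_{31}-\lambda_{22})\}$$

$$\qquad - \frac{4(N-2)}{N(N+1)(N+3)}\{[(2N-9)\rho^2+1]\lambda_{30}^2 - 2(2N-9)\rho\lambda_{30}\lambda_{21} + (N-6)\lambda_{21}^2 + (N-3)\lambda_{30}\lambda_{12}\}\Big]. \tag{17}$$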

We shall now compare the formulae (16) and (17) with the well-known results of Quensel
(1938) and Cook (1951). Quensel derived the formulae for the mean and the variance of
the regression coefficient from the joint characteristic function of the second-order moments
which is valid for small $\rho$ only. Though his final expressions include terms up to those of
order $N^{-1}$ only, from his formulae (133) and (134) (Quensel, 1938, pp. 106-7) the full
expressions for $E(b_{21})$ and $V(b_{21})$ can easily be written retaining, however, only those terms
which involve cumulants (and products and squares of cumulants) relevant to the present
work. Keeping in view the fact that in Quensel's work the coefficients of the order $\rho^2\lambda_{30}^2$
have not been taken into account, we find that these expressions agree with our formulae
(16) and (17) except for the term in $\rho^2\lambda_{30}^2$ in (17). It is thus observed that the mean and the
variance of $b_{21}$ check with those derived by Quensel.

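If we express our results (16) and (17) in powers of 1/N and retain terms up to those of
the order $N^{-2}$, we get

$$E(b_{21}) = \left(\frac{\kappa_{02}}{\kappa_{20}}\right)^{\frac12}\left[\rho + \frac{N-2}{N^2}(\rho\lambda_{40}-\lambda_{31}) - \frac{4}{N^2}(\rho\lambda_{30}^2-\lambda_{30}\lambda_{21})\right] \tag{18}$$

and

$$V(b_{21}) = \frac{\kappa_{02}}{\kappa_{20}}\Big[\frac{N+3}{N^2}(1-\rho^2) + \frac{1}{N}(\rho^2\lambda_{40} - 2\rho\lambda_{31} + \lambda_{22}) - \frac{1}{N^2}\{(5\rho^2-1)\lambda_{40} - 8\rho\lambda_{31} + 4\lambda_{22}\}$$

$$\qquad - \frac{4}{N^2}(2\rho^2\lambda_{30}^2 - 4\rho\lambda_{30}\lambda_{21} + \lambda_{30}\lambda_{12} + \lambda_{21}^2)\Big]. \tag{19}$$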


Cook's formulae for the mean and the variance of $b_{21}$, up to the order $N^{-2}$ for a general
bivariate population with cumulants existing, derived by using the k-statistics, agree with
our results (18) and (19), provided only those semi-invariants (and their products and
squares) which have been considered in the present study are retained in them.*

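We may now consider a few particular cases of the population form. If the variate x has
a normal distribution (so that $\lambda_{i0} = 0$), we have

$$E(b_{21}) = \left(\frac{\kappa_{02}}{\kappa_{20}}\right)^{\frac12}\left[\rho - \frac{N-1}{N(N+1)}\lambda_{31}\right] \tag{20}$$

and

$$V(b_{21}) = \frac{\kappa_{02}}{(N-3)\kappa_{20}}\left[(1-\rho^2) - \frac{(N-1)(N-5)}{N(N+1)}(2\rho\lambda_{31}-\lambda_{22}) - \frac{4(N-2)(N-6)}{N(N+1)(N+3)}\lambda_{21}^2\right]. \tag{21}$$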
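In the case of strictly linear regression, when $\lambda_{i1} = \rho\lambda_{i+1,0}$ (Wicksell, 1933), the formulae
(16) and (17) reduce to

$$E(b_{21}) = \left(\frac{\kappa_{02}}{\kappa_{20}}\right)^{\frac12}\rho = \beta_{21}, \tag{22}$$

and

$$V(b_{21}) = \frac{\kappa_{02}}{(N-3)\kappa_{20}}\Big[(1-\rho^2) - \frac{N-1}{N(N+1)}\{[(N-4)\rho^2-1]\lambda_{40} - (N-5)\lambda_{22}\}$$

$$\qquad + \frac{4(N-2)}{N(N+1)(N+3)}\{[(N-3)\rho^2-1]\lambda_{30}^2 - (N-3)\lambda_{30}\lambda_{12}\}\Big]. \tag{23}$$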


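If the two variates x and y are uncorrelated, then $\lambda_{ij} = 0$ for $i \neq 0$, $j \neq 0$, and consequently

$$E(b_{21}) = 0 \tag{24}$$

and

$$V(b_{21}) = \frac{\kappa_{02}}{(N-3)\kappa_{20}}\left[1 + \frac{N-1}{N(N+1)}\lambda_{40} - \frac{4(N-2)}{N(N+1)(N+3)}\lambda_{30}^2\right]. \tag{25}$$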


* It may be remarked here that since Cook’s formulae for the mean and the variance of b,, contain 
all the terms of the order N-1 and N-? (e.g. those in Ago, As, Ado, Ago Agi, Cte.), they will be more accurate 
where our assumptions about the parent population are not true and where in addition to the knowledge 
On Ago, Ago, etc., we have some knowledge on Ajo, A;;, etc., also available, provided N is not very small. 


If, in addition, x is a normal variate,

$$V(b_{21}) = \frac{\kappa_{02}}{(N-3)\kappa_{20}}, \tag{26}$$

which is the same as the variance of $b_{21}$ when both the variates are independent and normally
distributed.

Quensel discussed some of these special cases, retaining terms up to order $N^{-1}$ only in
his formulae. We find them to check with the above results.


4. TEST OF SIGNIFICANCE OF THE REGRESSION COEFFICIENT 


Suppose the null-hypothesis to be tested specifies that the population value of the
regression coefficient of y on x is $\beta_{21}$. In the case when the marginal variances are known, we use
the statistic


HY 
ay Fe 27 
t= (Bt) Va On Bad (27) 
for the purpose, which follows Student’s t-distribution with N —1 degrees of freedom in 
the normal-theory case. 


In the case of unknown marginal variances, we use the statistic

$$t = \frac{\sqrt{(N-2)}\,(m_{11} - m_{20}\beta_{21})}{\sqrt{(m_{20}m_{02} - m_{11}^2)}}, \tag{28}$$

which is known to be distributed as Student's t with N - 2 degrees of freedom in the case
of a normal parent population.
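
The statistic for unknown marginal variances is the familiar t for a regression slope written in terms of the second-order moments. The Python sketch below is illustrative only (the function names are hypothetical, and the moments are assumed to carry the divisor N): it computes $t = \sqrt{(N-2)}\,(m_{11}-m_{20}\beta_{21})/\sqrt{(m_{20}m_{02}-m_{11}^2)}$ directly, and also the usual "estimate minus hypothesis over standard error" form, so that the two can be compared on the same data.

```python
import math


def second_order_moments(xs, ys):
    """m20, m11, m02 with divisor N (an assumption stated in the lead-in)."""
    n = len(xs)
    xbar, ybar = sum(xs) / n, sum(ys) / n
    m20 = sum((x - xbar) ** 2 for x in xs) / n
    m11 = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys)) / n
    m02 = sum((y - ybar) ** 2 for y in ys) / n
    return m20, m11, m02


def t_from_moments(xs, ys, beta21=0.0):
    """t computed from the second-order moments only."""
    n = len(xs)
    m20, m11, m02 = second_order_moments(xs, ys)
    return math.sqrt(n - 2) * (m11 - m20 * beta21) / math.sqrt(m20 * m02 - m11 ** 2)


def t_from_regression(xs, ys, beta21=0.0):
    """Same statistic via the fitted slope and its estimated standard error."""
    n = len(xs)
    xbar, ybar = sum(xs) / n, sum(ys) / n
    m20, m11, _ = second_order_moments(xs, ys)
    b21 = m11 / m20
    rss = sum((y - ybar - b21 * (x - xbar)) ** 2 for x, y in zip(xs, ys))
    std_err = math.sqrt(rss / (n - 2) / (n * m20))
    return (b21 - beta21) / std_err
```

The two routes agree up to rounding, which makes explicit why the distribution of t does not depend on the scales $\kappa_{20}$ and $\kappa_{02}$.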

It is straightforward to derive the distribution of $t'$ from the frequency function of $b_{21}$
expressed by (12). The derived expression is independent of $\kappa_{20}$ and $\kappa_{02}$, but owing to its
lengthy form, we do not propose to give it here.

We proceed to derive the frequency distribution of t defined by (28). In doing so, it may 
appear unsound to assume that the higher-order population cumulants are known while 
the second-order moments are not. But as has been remarked earlier, here our objective 
is to find roughly how much non-normality can be allowed in the parent population without 
seriously affecting the normal theory results, and the derived distribution of t will be quite 
useful for the purpose. 


Let us start with the joint distribution of the second-order moments denoted by (8), 
and consider its typical term 


$$f_0(m_{20}, m_{11}, m_{02})\,M_{20}^{r_1}M_{11}^{r_2}M_{02}^{r_3}\,(m_{20}m_{02} - m_{11}^2)^{s}, \tag{29}$$


where $r_1, r_2, r_3$ take the values 0, 1, 2, 3 such that $r = r_1 + r_2 + r_3 \le 3$ and s takes the values
4. 


Putting

$$m_{02} = \frac{N-2}{m_{20}\,t^2}\,(m_{11} - m_{20}\beta_{21})^2 + \frac{m_{11}^2}{m_{20}} \tag{30}$$

in (29), we have to integrate out $m_{11}$ and $m_{20}$ in order to get the corresponding frequency
function of t. This can be done by making the substitution


O= ~i— os “i, 11— Mf)", (31) 
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and evaluating the integrals. We find 














NN-\( N —2)}N-2)+8 Koo(1 — p2)8N—D+4r+28 po Nm Ay 
Gr, rary) = — 4n1(N —2) KN 1rtepN-14287'r I ¢ (- a) May *tT+2s 
x [exp |- = em ") 290| O1-aeirpt0 — 2pT64 + T): (pO — T04)" dO dmg 
0 2 
‘ (32) 
_ 27#(KgoK ya)? (1—p2** 
~ Nr+2s7 7] + t2/(N— 2)}° Jolt 
PL3(N —1)+8+3k) T[R(N —1)+8+r—- $k] Th ; 
, lf) ———; ; 32) bis, 
«2 TRV — HP ee oe 
2 
where = ot (33) 


$g_0(t)$ is the frequency function of Student's t with N - 2 degrees of freedom, and $a_k(\rho, T)$ is

given by r 

(p20 — 2pT0* + T')"1 (p6 — T04)20"s = > a,(p, T) 0%. (34) 
k=0 


Using (32) bis, after some simplification, we can write the frequency function of ¢ as 
obtained from (8) in the form 


‘ eat eee -1)?- PGW) |", 7 
a) = 910 |1+ srr papa — DVO Paull —)P-1)+2( ey) ae 





- FED OFS DAT — 10-09 [Zon — 7-1} +W—1)an(1- 3) | 
+ (mqar—y) 78 +4N(1—/p?) a4 (5 T - 1) + 7 (1 — p*)ao, (1 _ )7|\|. (35) 
where = 4, = P?Agy— 2PAg, + Avo, 

Oyo = PAgg— (1 + 2p?) Agy + BpAg2 — Ajg; 

Oy, = PAS — 20(1 + p?) AgoAgs + 2P*AgoAye + (1 + 3p?) AZ, — 4pAgy Ayo + Af, 

Gao = AgoAye — Ady — PAgo Ags + PAs Are — Aig + Avi Ags, 

Gag = PASy— (1+ 4p?) AgoAgy + 2p(1 +p?) AgoAre + 2p(2 + p?) Ad 

— p?Agq Ags — (2 + 7p?) Agi Aya + BPAZy + 2PAg Ags — Ar2Ags; 
Ohog = P®A5y — BP7AgoAai + PAgso Are + 2PAR — Aor Are, 
Ga, = WPAgyAy2— 2PAZ, — AgoAgs + Avi Ar2- . 


j (36) 





The distribution of t is again independent of $\kappa_{20}$ and $\kappa_{02}$. The upper tail area of the
distribution is given by


Peet) = [oat 


m oi N — 1) (1—tug)¥1 (WW — 1) (1p?) ub 1 
= #[1-1,,(4,4N ae | RW a 8 * sa anal 
_ _(N=2)(1—u)#4 (N11) V(1- p*)uk (N-1 

N(N +1) (N +3) (1- aul Gr 302+ 20) 


FP ati 2) uy Utne + (ON + 1) ty Tag}], (87) 








B(t, N —1) 





1 
30ta +- 
* 2B, HN — =n ps 
where $u_0 = t_0^2/(t_0^2 + N - 2)$ and $I_{u_0}$ denotes the incomplete beta function.

It is of interest to see here that as N tends to infinity, the corrective terms do not all
disappear, and the formula (37) takes the form


P{t > to} = —— . ete dey — “lo __ 6-44, (38) 


This shows that for large N, only $\rho$ and the coefficient $\alpha_{11}$ $(= \rho^2\lambda_{40} - 2\rho\lambda_{31} + \lambda_{22})$ will be
predominant in contributing towards the effect of non-normality. Thus ordinarily for a
given $t_0$, the contribution of positive $\alpha_{11}$ will be to increase the normal theory upper
tail-area and that of negative $\alpha_{11}$ will be to decrease it.

Generally, a two-sided test with equal tail areas is used when the null hypothesis simply
specifies the value of $\beta_{21}$ and the alternatives can be of any form. To examine how the tail
areas are affected by parental non-normality in such a case, we may use the formula


Pile] > th =|" gitar |” trae 


(N —1) u(1—w,)89 
= ORY D1 wari) p) BEN 


2(N —2 N-1 
x lw —1)o%1— Gy ner {2a +75 as (39) 


5. FURTHER WORK


It is proposed in a continuation of this paper to study numerically the order of the effects
which the various cumulants of the population will have on the mean and the variance of
$b_{21}$ and on the probability points of t. A comparison of these with known results will also
be made. A considerable part of the computation involved is already finished and it is
hoped to be able to publish the complete results in the next issue of Biometrika.


I should like to acknowledge my indebtedness to Dr A. K. Gayen for his kind advice and
help in the course of preparation of this paper and also to Prof. E. S. Pearson from whom
I have received helpful criticism on certain aspects of this work.

The separation of molecular compounds by countercurrent
dialysis: a stochastic process


By MARVIN A. KASTENBAUM 
Mathematics Panel and Biology Division, Oak Ridge National Laboratory* 


1. BACKGROUND


Peptides isolated from rat-liver microsomes may be separated from contaminating 
amino acids, sugars and salts by countercurrent dialysis. A single such dialysis can achieve 
only fractional purification. Therefore, more complex systems of multiple dialysis, which 
achieve a higher degree of purification, have been devised. 

The system with which this paper will be concerned is one originally suggested by Craig 
and associates (Craig & King, 1955, 1956; Craig, King & Stracher, 1957), in which a linear 
series of cells is used. Each dialysis cell is termed a stage, and each dialysis period is termed 
a cycle. After each cycle, the dialout† at each stage is concentrated and used as the dialin†
in the succeeding stage, whereas the dialin at each stage is concentrated and returned to
the dialysis sac at the previous stage. The dialin at the first stage is retained at the first 
stage; and the dialout at the final stage is taken out of the system. In the nomenclature of 
stochastic processes, the first stage represents a reflecting barrier and the last stage an 
absorbing barrier. The problem is to determine the probability with which a particle of the 
isolate will be at a specified stage in the system after a given number of cycles have been 
carried out. 

2. FORMULATION 
2-1. The diffusion matrix 


Consider an m-stage system. Let p be the probability that, at any stage, a particle inside
the dialysis sac will not penetrate to the outside volume, and (1 - p) the probability that it
will penetrate to the outside. Also specify that, once outside, the particle cannot return
inside, except by manual transfer. Then the matrix of diffusion probabilities for m stages
is the square matrix of order 2m given by

$$D = \begin{bmatrix} d_1 & 0 & \cdots & 0 \\ 0 & d_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & d_m \end{bmatrix}, \tag{2-1}$$

where

$$d_j = \begin{bmatrix} p & 1-p \\ 0 & 1 \end{bmatrix} \qquad (\text{for all } j = 1, 2, \ldots, m)$$


is the matrix of diffusion probabilities for the jth stage. The odd rows and columns of D 
correspond to events taking place inside the dialysis sac for successive stages and the even 
rows and columns correspond to events taking place on the outside for successive stages. 

* Operated by Union Carbide Corporation for the U.S. Atomic Energy Commission. 

† These terms have been suggested by Anderson & Pennington (1959) to dispel the ambiguity in
the term dialysate. Convenient abbreviations for the solutions inside and outside the dialysis sac are
'dialin' and 'dialout', respectively.

2-2. The transfer matrix


A cycle is defined as a constant interval of time commencing with the transfer of material 
into the dialysis sac (the beginning of the diffusion process) and concluding at the end of 
the diffusion process. 

The transfer matrix, T, is a square matrix of order 2m whose elements are probabilities
equal to zero or unity and which may be constructed according to the following scheme.
At the end of each cycle

(i) a particle on the inside at stage 1 will remain on the inside of stage 1 with probability 1; 
(ii) a particle on the inside at stage j will be transferred to the inside of stage (j— 1) 
with probability 1, for j = 2,3, ...,m; 

(iii) a particle on the outside of stage j will be transferred to the inside of stage (j + 1) with
probability 1, for j = 1, 2, ..., (m - 1);

(iv) a particle on the outside of stage m will remain on the outside of stage m with 
probability 1. 

The system may be completely described by recursion formulas as follows. Let $P_{ij}$ be the
probability that a particle is in the inside volume of the jth stage at the end of the ith cycle,
and $(1-p)P_{ij}/p$ the probability that it is in the outside volume, where, for all cycles and
stages, 0 < p < 1 is the probability that a particle inside will stay inside, (1 - p) is the
probability that a particle inside will penetrate to the outside volume, and where the
probability is zero that a particle will pass from the outside volume to the inside volume.

Then the probability that a particle is in the inside volume of stage j = 1, 2, ..., m at the
end of cycle i = 1, 2, ..., n is given by the recursions

$$P_{i1} = p\,(P_{i-1,1} + P_{i-1,2}),$$

$$P_{ij} = (1-p)\,P_{i-1,j-1} + p\,P_{i-1,j+1} \qquad [\text{for } j = 2, 3, \ldots, (m-1)],$$

and

$$P_{im} = (1-p)\,P_{i-1,m-1},$$
where $P_{01} = 1$, $P_{0j} = 0$ (for $j = 2, 3, \ldots, m$),
and $P_{ij} = 0$ (for $i < j$).


2-3. The stochastic matrix 


If D and T are two square matrices as defined, then their product DT is a square matrix
of order 2m with non-negative elements and unit row sums. The elements of this matrix
are the conditional probabilities that a particle will make a transition from one of the 2m
states to another. Such a matrix is a stochastic matrix.

Then if i = 1, 2, ..., is the number of cycles through which the system is to be carried,
$(DT)^i$ is the matrix of conditional probabilities at the beginning of cycle (i + 1). By the
same token, $(DT)^iD$ is the matrix of conditional probabilities at the end of cycle (i + 1),
before transfer has been carried out.

If, as is the case in this system, all the material to be dialysed is placed in the dialysis sac
at the first stage, the initial probabilities associated with the 2m states are represented by
the 2m-dimensional unit vector

$$e_1 = (1, 0, 0, \ldots, 0). \tag{2-2}$$

The probabilities at the beginning and the end of cycle (i + 1) are then respectively

$$e_1(DT)^i \quad \text{and} \quad e_1(DT)^iD. \tag{2-3}$$





The discussion in the succeeding sections will involve the algebraic reduction of the
matrix $(DT)^i$ for a three-stage system and then for the general m-stage system.


3. THE MATRIX $(DT)^i$


By the rules for its construction, the transfer matrix, T, will have (m - 1) null columns,
namely, all those columns, except the 2mth, involving transfers to the outside volume.
Hence the matrix DT as well as its power $(DT)^i$ will have (m - 1) null columns whose
positions are known. This property suggests that an initial simplification would be to reduce the
matrix DT of order 2m to a matrix DT of order (m + 1), by eliminating the (m - 1) null
columns and their respective rows. The resulting matrix is of full rank, and is given as

$$DT = \begin{bmatrix}
p & 1-p & 0 & 0 & \cdots & 0 & 0 & 0 \\
p & 0 & 1-p & 0 & \cdots & 0 & 0 & 0 \\
0 & p & 0 & 1-p & \cdots & 0 & 0 & 0 \\
\vdots & \vdots & \vdots & \vdots & & \vdots & \vdots & \vdots \\
0 & 0 & 0 & 0 & \cdots & 0 & 1-p & 0 \\
0 & 0 & 0 & 0 & \cdots & p & 0 & 1-p \\
0 & 0 & 0 & 0 & \cdots & 0 & 0 & 1
\end{bmatrix}. \tag{3-1}$$

Let $\Lambda$ be the Jordan normal form of DT. Then we may write

$$(DT)^i = M^{-1}\Lambda^iM, \tag{3-2}$$

where M is a square matrix of order (m + 1) of characteristic vectors resulting from the
solutions of the equations

$$x[DT - \lambda I] = 0 \tag{3-3}$$

and where x is the (m + 1)-dimensional invariant row vector $(x_1, x_2, \ldots, x_{m+1})$. In this
instance, $\Lambda$ may be shown to be a diagonal matrix of order (m + 1) whose elements are the
roots of the determinantal equation

$$|DT - \lambda I| = 0. \tag{3-4}$$

Thus there will always be one known root equal to unity, and m unknown roots whose values
may be determined from the solution of the mth degree polynomial equation in $\lambda$

$$\psi_m(\lambda) = (-1)^m \begin{vmatrix}
p-\lambda & 1-p & 0 & 0 & \cdots & 0 & 0 & 0 \\
p & -\lambda & 1-p & 0 & \cdots & 0 & 0 & 0 \\
0 & p & -\lambda & 1-p & \cdots & 0 & 0 & 0 \\
\vdots & \vdots & \vdots & \vdots & & \vdots & \vdots & \vdots \\
0 & 0 & 0 & 0 & \cdots & -\lambda & 1-p & 0 \\
0 & 0 & 0 & 0 & \cdots & p & -\lambda & 1-p \\
0 & 0 & 0 & 0 & \cdots & 0 & p & -\lambda
\end{vmatrix} = 0. \tag{3-5}$$








4. THE REDUCTION OF $(DT)^i$ FOR m = 3


The diffusion and transfer matrices for a three-stage system are each square matrices
of order six and may be written respectively as

$$D = \begin{bmatrix}
p & 1-p & 0 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 & 0 & 0 \\
0 & 0 & p & 1-p & 0 & 0 \\
0 & 0 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 0 & p & 1-p \\
0 & 0 & 0 & 0 & 0 & 1
\end{bmatrix} \tag{4-1}$$

and

$$T = \begin{bmatrix}
1 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 1 & 0 & 0 & 0 \\
1 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 1 & 0 \\
0 & 0 & 1 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 1
\end{bmatrix}. \tag{4-2}$$

Their product DT is of order six and of rank four, containing two null column vectors in
the second and fourth positions. By eliminating the two null columns and their respective
rows, we form the matrix

$$DT = \begin{bmatrix}
p & 1-p & 0 & 0 \\
p & 0 & 1-p & 0 \\
0 & p & 0 & 1-p \\
0 & 0 & 0 & 1
\end{bmatrix}. \tag{4-3}$$

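The characteristic equation (3-4) yields one root equal to unity and three unknown roots
$\lambda_1$, $\lambda_2$ and $\lambda_3$ whose values may be determined from the polynomial equation

$$\psi_3(\lambda) = \lambda^3 - p\lambda^2 - 2p(1-p)\lambda + p^2(1-p) = 0. \tag{4-4}$$

It follows from (4-4) that

$$\lambda_1 + \lambda_2 + \lambda_3 = p = a_1,$$

$$\lambda_1\lambda_2 + \lambda_1\lambda_3 + \lambda_2\lambda_3 = -2p(1-p) = a_2, \tag{4-5}$$

and

$$\lambda_1\lambda_2\lambda_3 = -p^2(1-p) = a_3,$$

whence it is easy to show that

$$(\lambda_1-1)(\lambda_2-1)(\lambda_3-1) = -(1-p)^3.$$

The matrix of characteristic vectors, M, is formed by solving for x in equation (3-3),
when m = 3. Equation (3-3) in this case reduces to

$$(1-p)x_3 + (1-\lambda)x_4 = 0,$$

$$(1-p)x_2 - \lambda x_3 = 0,$$

$$(1-p)x_1 - \lambda x_2 + px_3 = 0, \tag{4-6}$$

and

$$(p-\lambda)x_1 + px_2 = 0.$$

When $\lambda = 1$, $x_1 = x_2 = x_3 = 0$, and we set $x_4 = 1$. For $\lambda = \lambda_j \neq 1$ (j = 1, 2, 3), set

$$x_4 = (1-p)^2/(\lambda_j - 1). \tag{4-7}$$

Then

$$x_3 = \frac{(\lambda_j-1)x_4}{1-p} = (1-p),$$

$$x_2 = \frac{\lambda_jx_3}{1-p} = \lambda_j, \tag{4-8}$$

and

$$x_1 = \frac{\lambda_jx_2 - px_3}{1-p} = \frac{\lambda_j^2}{1-p} - p,$$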








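and the matrix of characteristic vectors may then be written as

$$M = \begin{bmatrix}
\dfrac{\lambda_1^2}{1-p} - p & \lambda_1 & (1-p) & \dfrac{(1-p)^2}{\lambda_1-1} \\
\dfrac{\lambda_2^2}{1-p} - p & \lambda_2 & (1-p) & \dfrac{(1-p)^2}{\lambda_2-1} \\
\dfrac{\lambda_3^2}{1-p} - p & \lambda_3 & (1-p) & \dfrac{(1-p)^2}{\lambda_3-1} \\
0 & 0 & 0 & 1
\end{bmatrix}. \tag{4-9}$$

The determinant of M is

$$|M| = \begin{vmatrix}
\dfrac{\lambda_1^2}{1-p} - p & \lambda_1 & (1-p) \\
\dfrac{\lambda_2^2}{1-p} - p & \lambda_2 & (1-p) \\
\dfrac{\lambda_3^2}{1-p} - p & \lambda_3 & (1-p)
\end{vmatrix}
= \begin{vmatrix}
\lambda_1^2 & \lambda_1 & 1 \\
\lambda_2^2 & \lambda_2 & 1 \\
\lambda_3^2 & \lambda_3 & 1
\end{vmatrix} = \Delta. \tag{4-10}$$

Then

$$e_1M^{-1} = \frac{1}{\Delta}\,[(1-p)V_{11},\ (1-p)V_{21},\ (1-p)V_{31},\ \Delta], \tag{4-11}$$

where $e_1$ is the four-dimensional vector (1, 0, 0, 0), and $V_{j1}$ is the co-factor of the element in
the jth row and first column of $\Delta$. Set

$$\Lambda = \begin{bmatrix}
\lambda_1 & 0 & 0 & 0 \\
0 & \lambda_2 & 0 & 0 \\
0 & 0 & \lambda_3 & 0 \\
0 & 0 & 0 & 1
\end{bmatrix}.$$

Then

$$e_1M^{-1}\Lambda^i = \frac{1}{\Delta}\,[(1-p)\lambda_1^iV_{11},\ (1-p)\lambda_2^iV_{21},\ (1-p)\lambda_3^iV_{31},\ \Delta]$$

and

$$e_1M^{-1}\Lambda^iM = e_1(DT)^i = \left[\frac{1}{\Delta}\{R_{i+2} - p(1-p)R_i\},\ \frac{(1-p)}{\Delta}R_{i+1},\ \frac{(1-p)^2}{\Delta}R_i,\ 1 - \frac{(1-p)^3}{\Delta}U_i\right], \tag{4-12}$$

where

$$R_i = \sum_{j=1}^{3}\lambda_j^iV_{j1} \quad \text{and} \quad U_i = -\sum_{j=1}^{3}\frac{\lambda_j^i}{\lambda_j-1}V_{j1}.$$

It follows that

$$R_0 = \begin{vmatrix} 1 & \lambda_1 & 1 \\ 1 & \lambda_2 & 1 \\ 1 & \lambda_3 & 1 \end{vmatrix} = 0, \qquad
R_1 = \begin{vmatrix} \lambda_1 & \lambda_1 & 1 \\ \lambda_2 & \lambda_2 & 1 \\ \lambda_3 & \lambda_3 & 1 \end{vmatrix} = 0, \qquad
R_2 = \begin{vmatrix} \lambda_1^2 & \lambda_1 & 1 \\ \lambda_2^2 & \lambda_2 & 1 \\ \lambda_3^2 & \lambda_3 & 1 \end{vmatrix} = \Delta,$$

and

$$R_i = pR_{i-1} + 2p(1-p)R_{i-2} - p^2(1-p)R_{i-3} \qquad (\text{for } i = 3, 4, \ldots, n). \tag{4-13}$$

Note that the coefficients in the recursion (4-13) are identical with those of the polynomial
(4-4). Also

$$U_0 = -\begin{vmatrix}
\dfrac{1}{\lambda_1-1} & \lambda_1 & 1 \\
\dfrac{1}{\lambda_2-1} & \lambda_2 & 1 \\
\dfrac{1}{\lambda_3-1} & \lambda_3 & 1
\end{vmatrix} = \frac{\Delta}{(1-p)^3}$$

and, in general,

$$U_i = U_{i-1} - R_{i-1} \qquad (\text{for } i = 1, 2, \ldots, n). \tag{4-14}$$

Thus the vector (4-12) may be evaluated for any i by applying the recursions expressed in
(4-13) and (4-14).

The six-dimensional vector of probabilities at the beginning of cycle (i + 1) may be
written directly from (4-12) by replacing two zero elements in the same positions from which
the null column vectors were previously dropped. Thus

$$e_1(DT)^i = \left[\frac{1}{\Delta}\{R_{i+2} - p(1-p)R_i\},\ 0,\ \frac{(1-p)}{\Delta}R_{i+1},\ 0,\ \frac{(1-p)^2}{\Delta}R_i,\ 1 - \frac{(1-p)^3}{\Delta}U_i\right] \tag{4-15}$$

and

$$e_1(DT)^iD = \Big[\frac{p}{\Delta}\{R_{i+2} - p(1-p)R_i\},\ \frac{(1-p)}{\Delta}\{R_{i+2} - p(1-p)R_i\},\ \frac{p(1-p)}{\Delta}R_{i+1},$$

$$\qquad \frac{(1-p)^2}{\Delta}R_{i+1},\ \frac{p(1-p)^2}{\Delta}R_i,\ 1 - \frac{(1-p)^3}{\Delta}U_{i+1}\Big]. \tag{4-16}$$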
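
The three-stage reduction lends itself to a direct numerical check. The Python sketch below is illustrative and makes its assumptions explicit: it works with the ratios $R_i/\Delta$ and $U_i/\Delta$ (so $\Delta$ never has to be computed), takes $R_0 = R_1 = 0$, $R_2 = \Delta$, $U_0 = \Delta/(1-p)^3$ and $U_i = U_{i-1} - R_{i-1}$ as the starting values and recursion, and compares the resulting six-dimensional vector $e_1(DT)^i$ with repeated multiplication by the 6 x 6 matrices D and T. The function names are hypothetical.

```python
from fractions import Fraction


def step(v, A):
    """Row vector times matrix."""
    return [sum(v[r] * A[r][c] for r in range(len(v))) for c in range(len(A[0]))]


def direct(p, i):
    """e1 (DT)^i for m = 3 by repeated multiplication with the 6 x 6 D and T."""
    c = 1 - p
    D = [[p, c, 0, 0, 0, 0], [0, 1, 0, 0, 0, 0], [0, 0, p, c, 0, 0],
         [0, 0, 0, 1, 0, 0], [0, 0, 0, 0, p, c], [0, 0, 0, 0, 0, 1]]
    T = [[1, 0, 0, 0, 0, 0], [0, 0, 1, 0, 0, 0], [1, 0, 0, 0, 0, 0],
         [0, 0, 0, 0, 1, 0], [0, 0, 1, 0, 0, 0], [0, 0, 0, 0, 0, 1]]
    v = [1, 0, 0, 0, 0, 0]
    for _ in range(i):
        v = step(step(v, D), T)
    return v


def by_recursion(p, i):
    """The same vector from the R and U recursions, scaled by 1/Delta."""
    c = 1 - p
    r = [0, 0, 1]                                  # R_0, R_1, R_2 over Delta
    while len(r) < i + 3:
        r.append(p * r[-1] + 2 * p * c * r[-2] - p * p * c * r[-3])
    u = 1 / c ** 3                                 # U_0 over Delta
    for k in range(1, i + 1):
        u -= r[k - 1]                              # U_k = U_{k-1} - R_{k-1}
    return [r[i + 2] - p * c * r[i], 0, c * r[i + 1], 0,
            c * c * r[i], 1 - c ** 3 * u]
```

Using `Fraction` for p keeps the comparison exact; with floating-point p the two routes would agree only up to rounding.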


5. THE mTH ORDER POLYNOMIAL IN $\lambda$
If $\psi_m(\lambda)$ be the mth order polynomial in $\lambda$, then the recursion

$$\psi_m(\lambda) = \lambda\psi_{m-1}(\lambda) - p(1-p)\psi_{m-2}(\lambda) \tag{5-1}$$

holds, with the initial conditions that

$$\psi_0(\lambda) = 1 \quad \text{and} \quad \psi_1(\lambda) = \lambda - p. \tag{5-2}$$

A particular solution of (5-1) is

$$\psi_m(\lambda) = y^m \tag{5-3}$$

if y is either $y_1$ or $y_2$, where $y_1$ and $y_2$ are the roots of the quadratic

$$y^2 - \lambda y + p(1-p) = 0. \tag{5-4}$$

Moreover, any solution of the recursion is a linear combination of any pair of independent
solutions. But $\psi_m(\lambda)$ is the particular solution whose values are 1 and $\lambda - p$ at 0 and 1,
respectively. Hence, we may write

$$\psi_m(\lambda) = B_1y_1^m + B_2y_2^m, \tag{5-5}$$

whence it follows that

$$B_1 + B_2 = 1 \quad \text{and} \quad B_1y_1 + B_2y_2 = \lambda - p. \tag{5-6}$$

The simultaneous solution of equations (5-6) is

$$B_1 = \frac{\lambda - p - y_2}{y_1 - y_2} \quad \text{and} \quad B_2 = \frac{-(\lambda - p - y_1)}{y_1 - y_2}, \tag{5-7}$$

which, when substituted into (5-5), yields

$$\psi_m(\lambda) = \frac{1}{y_1 - y_2}\{(y_1^{m+1} - y_2^{m+1}) - p(y_1^m - y_2^m)\}. \tag{5-8}$$

Now let $y_1 = re^{i\theta}$ and $y_2 = re^{-i\theta}$, where $r = \sqrt{[p(1-p)]}$ and $\cos\theta = \lambda/(2r)$, $\sin\theta > 0$.
Then

$$\psi_m(\lambda) = \frac{r^m\sin(m+1)\theta - pr^{m-1}\sin m\theta}{\sin\theta}. \tag{5-9}$$

If $t = \cos\theta$, then

$$C_k(t) = \frac{\sin(k+1)\theta}{\sin\theta} \tag{5-10}$$

is the kth-order Chebyshev polynomial of the second kind. Thus one method of evaluating
the roots of the mth-order polynomial in $\lambda$ would be to solve for $t = \tfrac12\lambda/\sqrt{[p(1-p)]}$ in
the equation

$$C_m(t) - \sqrt{\left(\frac{p}{1-p}\right)}\,C_{m-1}(t) = 0, \tag{5-11}$$

or

$$\frac{C_{m-1}(t)}{C_m(t)} = \sqrt{\left(\frac{1-p}{p}\right)}. \tag{5-12}$$

The polynomials $C_k$ can be obtained directly by employing the methods outlined by
Householder (1958).
| Though explicit values of the unknown roots are not available in general from (5-11) or 
) from any other solution of the mth-order polynomial equation, the elementary symmetric 
functions of these roots, derived from the coefficients of the polynomial, may be used in 
evaluating (2-3) explicitly. The coefficients of the mth degree polynomial may be obtained 
formally from (5-1) by use of generating functions in the following way:* Write 





WmlA) = bg + bOPA+...+bmA™ (5-13) 
and Wte,A) = ES Ym(A)e™ = 5 AK TS dhe. (5-14) 
m=0 k=0 j=k 
) From (5-1) and (5-2) we obtain 
= at ae 1—pz tel . 
FRA) = 1—Az+p(l—p)2 1+p(1— alt 1+p(l—p)z*| ~ ales 


} Expanding (5-15) in powers of A, and letting g = p(1—p), we obtain 





) 5 aa — pz) zkAk 
(z,A) = = ee (5-16) 

Thus from (5-14) we may write 

z bi.2i = (1—pz)2* : - E ‘) qza+ vs. qrzt — 4 . (5°17) 
) ; 
h bere — (—-1(**8 : 5-18 
whence gee = (—I7 [p(l—p)] (5-18) 
k+s 

and be+ee+1 — ( — 1)8+1 - pt\(l—p), for k=0,1,2,.... (5-19) 
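
As a check on the closed forms, the coefficients can be generated directly from the recursion (5-1) and compared with (5-18) and (5-19). The sketch below is illustrative only: the function names `psi_coeffs` and `closed_form` are not from the paper, exact rational arithmetic is assumed for p, and (5-18) and (5-19) are read as $b_k^{(k+2s)} = (-1)^s\binom{k+s}{s}[p(1-p)]^s$ and $b_k^{(k+2s+1)} = (-1)^{s+1}\binom{k+s}{s}p^{s+1}(1-p)^s$.

```python
from fractions import Fraction
from math import comb


def psi_coeffs(m, p):
    """Coefficients [b_0, ..., b_m] of psi_m(lambda), built from recursion (5-1)."""
    prev, cur = [Fraction(1)], [-p, Fraction(1)]  # psi_0 = 1, psi_1 = lambda - p
    if m == 0:
        return prev
    q = p * (1 - p)
    for _ in range(2, m + 1):
        nxt = [Fraction(0)] + cur          # lambda * psi_{m-1}
        for k, c in enumerate(prev):       # minus p(1-p) * psi_{m-2}
            nxt[k] -= q * c
        prev, cur = cur, nxt
    return cur


def closed_form(m, k, p):
    """b_k^{(m)}: (5-18) when m - k is even, (5-19) when m - k is odd."""
    if k > m:
        return Fraction(0)
    s, odd = divmod(m - k, 2)
    if odd:
        return (-1) ** (s + 1) * comb(k + s, s) * p ** (s + 1) * (1 - p) ** s
    return (-1) ** s * comb(k + s, s) * (p * (1 - p)) ** s


if __name__ == "__main__":
    p = Fraction(1, 3)
    for m in range(6):
        print(m, psi_coeffs(m, p))
```

For example, with any p the call `psi_coeffs(2, p)` returns the coefficients of λ² − pλ − p(1−p), in increasing powers of λ.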

* The author wishes to express his gratitude to the referee who, in addition to making other helpful suggestions, provided the method for directly evaluating the coefficients of $\psi_m(\lambda)$.

Before passing to the ultimate reduction of $(DT)^i$ it might be worth noting that the roots
of the mth-order polynomial are all real and distinct. That they are real may be demon- 
strated by a simple contragredient transformation on DT, which reduces it to a triple 
diagonal symmetric matrix; that they are both real and distinct may be shown by a direct 
application of the argument presented by Householder (1953, pp. 181-2). 


6. THE GENERAL REDUCTION OF $(DT)^i$ FOR ANY m


Consider the square matrix DT of order (m + 1) as given in (3-1), and let x be the characteristic
vector as defined in (3-3). If (3-4) is the determinantal equation, then λ = 1 is
always one root occurring with multiplicity 1. Associated with this root is the (m + 1)-dimensional
characteristic vector

Ens = (0,0,...,0, 1). (6-1) 


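In general, the elements of x when $\lambda = \lambda_j \neq 1$ (for j = 1, 2, ..., m) may be found from the
simultaneous equations

$$\begin{aligned}
(1-p)\,x_m + (1-\lambda_j)\,x_{m+1} &= 0,\\
(1-p)\,x_{m-1} - \lambda_j x_m &= 0,\\
(1-p)\,x_k - \lambda_j x_{k+1} + p\,x_{k+2} &= 0 \quad (\text{for } k = 1, 2, \ldots, m-2),\\
(p-\lambda_j)\,x_1 + p\,x_2 &= 0,
\end{aligned}$$ (6-2)

by first setting

$$x_{m+1} = \frac{(1-p)^{m-1}}{\lambda_j - 1}.$$ (6-3)

Then

$$\begin{aligned}
x_m &= (1-p)^{m-2},\\
x_{m-1} &= \lambda_j (1-p)^{m-3},\\
x_k &= \frac{\lambda_j x_{k+1} - p\,x_{k+2}}{1-p} \quad (\text{for } k = 1, 2, \ldots, m-2).
\end{aligned}$$ (6-4)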

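Let M be the square matrix of order (m + 1) of characteristic vectors so generated. It may
be demonstrated that the determinant of M

$$|M| = (1-p)^{\frac{1}{2}m(m-3)}
\begin{vmatrix}
\lambda_1^{m-1} & \lambda_1^{m-2} & \cdots & \lambda_1 & 1\\
\lambda_2^{m-1} & \lambda_2^{m-2} & \cdots & \lambda_2 & 1\\
\vdots & \vdots & & \vdots & \vdots\\
\lambda_m^{m-1} & \lambda_m^{m-2} & \cdots & \lambda_m & 1
\end{vmatrix}
= (1-p)^{\frac{1}{2}m(m-3)}\prod_{j<k}(\lambda_j-\lambda_k) = (1-p)^{\frac{1}{2}m(m-3)}\,\Delta.$$ (6-5)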

Then ¢,M-1 = | mal en cenit (6-6) 


where ¢€, is the (m+ 1)-dimensional vector (1,0,0,...,0), and Vj, is the co-factor of the 
element in the jth row and first column in A, and 





eee 3 (6-7) 


where A is the Jordan normal form of DT as described in § 3. 


e,M-1At = [oat (l1—p) Az Vax (1 —P) Am Vina ‘ 1 


Now define

$$R_i = \sum_{j=1}^{m}\lambda_j^{\,i}\,V_{j1} \quad (\text{for } i = 1, 2, \ldots, n).$$ (6-8)

Then, as in § 4 it is easy to see that

$$R_i = \begin{cases}
0 & [\text{for } i = 0, 1, 2, \ldots, (m-2)],\\
\Delta & [\text{for } i = (m-1)],\\
\sum_{s=1}^{m}(-1)^{s+1}\theta_s R_{i-s} & [\text{for } i = m, (m+1), \ldots, n],
\end{cases}$$ (6-9)

where $\theta_s$ is the sth elementary symmetric function of the roots of the polynomial $\psi_m(\lambda) = 0$.
That is to say, the coefficients in the recursion (6-9) are identical with those of the polynomial
equation $\psi_m(\lambda) = 0$.





Let €: MAM = (04, Vo, ..25 Um41)- (6-10) 
Then, if in equations (6-4), A‘ is replaced by R;.,,, 

o = EPs, (for 7 = 1,2, ..., m), (6-11) 

and 4, = 1 -Cor U;, (6-12) 

where U0, = aca and U,=U,_,—R,_, (for é = 1,2,...,#). (6-13) 

Finally

$$e_1(DT)^i = (v_1, 0, v_2, 0, \ldots, 0, v_m, v_{m+1})$$ (6-14)

and

$$e_1(DT)^i D = [\,p v_1,\ (1-p)v_1,\ p v_2,\ (1-p)v_2,\ \ldots,\ p v_m,\ (1-p)v_m + v_{m+1}\,].$$ (6-15)
A note on the error after a number of terms of the
David-Johnson series for the expected values
of normal order statistics


By J. G. SAW
University College London


1. INTRODUCTION


David & Johnson (1954) proposed that a finite number of terms of a certain infinite series 
could be used as an approximation to the integral involved in evaluating the expected value 
of the rth smallest of n-ordered normal variates. 

Plackett (1958) obtained a different infinite series and was able to give bounds for the 
error which resulted in using only a finite number of terms of this series. 

It is the purpose of this note to derive bounds for the error which results in using only 
a finite number of terms of the David & Johnson series. With this accomplished, the last 
section is devoted to applying both Plackett’s method and David & Johnson’s method for 
a particular example so that the two techniques may be compared. Briefly, it is found that 
the convergence of the Plackett series is a little faster than the David & Johnson series and 
that the bounds which Plackett gives are slightly sharper than those which we shall 
derive for the David & Johnson series. However, these small differences seem to be 
negligible in the light of the great computational advantages of the David & Johnson 
technique over that of Plackett. 


2. NOTATION


Let $x_{1:n} < x_{2:n} < \ldots < x_{n:n}$ be an ordered sample of n observations from the standard
normal population.
Write

$$Z(x) = \frac{1}{\sqrt{2\pi}}\exp(-\tfrac{1}{2}x^2), \qquad P(x) = \int_{-\infty}^{x} Z(u)\,du, \qquad p_r = 1 - q_r = r/(n+1).$$

We use x(P) for the inverse function defined by

$$\int_{-\infty}^{x(P)} Z(u)\,du = P$$

and write

$$\beta_P(a,b) = \frac{\Gamma(a+b)}{\Gamma(a)\Gamma(b)}P^{a-1}(1-P)^{b-1}, \qquad M_t(a,b) = \int_0^1\left(P-\frac{a}{a+b}\right)^t\beta_P(a,b)\,dP.$$

$M_t(a,b)$ is thus the tth central moment of a Beta or Pearson Type I distribution and is
hypergeometric in form.

We will use $x_t(P)$ to denote $(d/dP)^t x(P)$, and $x_t(p_r)$ to denote $(d/dP)^t x(P)$ evaluated at $P = p_r$.
3. THE DAVID AND JOHNSON EXPANSION
The probability function of $x_{r:n}$ is given by

$$p(x_{r:n}) = \frac{1}{B(r, n-r+1)}\{P(x_{r:n})\}^{r-1}\{1-P(x_{r:n})\}^{n-r}Z(x_{r:n}),$$ (3-1)

so that

$$E(x_{r:n}) = \int_{-\infty}^{+\infty} p(x_{r:n})\,x_{r:n}\,dx_{r:n}$$ (3-2)

$$= \int_0^1\beta_P(r, n-r+1)\,x(P)\,dP.$$ (3-3)

We have, following David & Johnson,

$$E(x_{r:n}) = \sum_{t=0}^{2m}\frac{1}{t!}\,x_t(p_r)\,M_t(r, n-r+1) + R_{2m}.$$ (3-4)

$R_{2m}$, the remainder after 2m terms, is the difference between the true value of $E(x_{r:n})$ and
the sum of the series to 2m terms. In practice, the remainder is usually found to be small
for quite small m so that the series may be used as an approximation to the true value of
$E(x_{r:n})$. Several forms for $R_{2m}$ exist. We consider a particular one which may be obtained
as follows.

Consider the Taylor series expansion for the function x(P) about the point $P = p_r$; then

$$x(P) = \sum_{t=0}^{2m}\frac{1}{t!}\,x_t(p_r)(P-p_r)^t + R'_{2m}.$$ (3-5)

Using the Darboux form for this remainder (see, for example, C. A. Stewart's Advanced
Calculus, p. 440), we write

$$R'_{2m} = \frac{(P-p_r)^{2m+1}}{(2m)!}\int_0^1(1-y)^{2m}\,x_{2m+1}(p_r+y(P-p_r))\,dy.$$ (3-6)

Multiply both sides of equation (3-5) by $\beta_P(r, n-r+1)$ and integrate over P between zero
and one. Using the definition of $M_t(r, n-r+1)$ and equation (3-3), we have exactly equation
(3-4) with

$$R_{2m} = \int_0^1\beta_P(r, n-r+1)\,R'_{2m}\,dP,$$ (3-7)

that is

$$R_{2m} = \frac{1}{(2m)!}\int_0^1\!\!\int_0^1(1-y)^{2m}(P-p_r)^{2m+1}\,x_{2m+1}(p_r+y(P-p_r))\,\beta_P(r, n-r+1)\,dP\,dy.$$ (3-8)

We shall try to give upper and lower bounds for $R_{2m}$, using the form of equation (3-8).


4. THE DERIVATIVES $x_j(P)$
We now discuss the behaviour of $x_j(P)$ and of

$$\eta_{2m+1}(P) = P^{2m+1}x_{2m+1}(P).$$

It is easily verified that for j ≥ 1 we have

$$x_j(P) = Z^{-j}(x)\,H_j(x),$$

where $H_j(x)$ is a polynomial in x of degree j − 1 and which is even or odd according as j is
odd or even. Thus we have

$$H_1(x) = 1, \quad H_2(x) = x, \quad H_3(x) = 1+2x^2, \quad H_4(x) = 7x+6x^3, \quad\text{etc.}$$





These polynomials obey the recurrence relation

$$H_{j+1}(x) = jx\,H_j(x) + H'_j(x)$$ (4-1)

and the author (1958) has given the numerical coefficients in $H_j(x)$ for j = 1(1)10. Further,
it is easily shown that

$$x_j(P) = (-1)^{j-1}x_j(1-P).$$ (4-2)

Turning now to $\eta_j(P)$, we have immediately that $\eta_j(0) = 0$ and $\eta_j(P)\to\infty$ as $P\to 1$.
For the later development we require to show that $\eta_{2m+1}(P)$ is an increasing function of P.
This is obviously true in the range ½ < P < 1 since $\eta_{2m+1}(P)$ is the product of two increasing
functions.

Suppose then that 0 < P < ½ (i.e. −∞ < x < 0) and we require to show that

$$0 < \frac{d}{dP}\eta_{2m+1}(P) = (2m+1)P^{2m}x_{2m+1}(P) + P^{2m+1}x_{2m+2}(P)$$

$$= \frac{P^{2m}}{Z^{2m+2}(x)}\,H_{2m+2}(x)\left[(2m+1)\frac{Z(x)\,H_{2m+1}(x)}{H_{2m+2}(x)} + P\right]$$ (4-3)

$$= F(P)\,G(P) \text{ say},$$

where G(P) is the expression in brackets.

Now for 0 < P < ½, F(P) is negative. Therefore since G(0) = 0, it is sufficient (but not
necessary) that dG/dP or dG/dx be negative in this same range. Now

$$\frac{dG(P)}{dx} = -\frac{Z(x)\,A_{2m+1}(x)}{H^2_{2m+2}(x)},$$ (4-4)

where

$$A_{2m+1}(x) = (2m+1)\,H_{2m+1}(x)\,H'_{2m+2}(x) - (2m+2)\,H'_{2m+1}(x)\,H_{2m+2}(x)$$ (4-5)

or, using equation (4-1),

$$A_{2m+1}(x) = (2m+1)\,H_{2m+1}(x)\,H_{2m+3}(x) - (2m+2)\,H^2_{2m+2}(x).$$ (4-6)

From equation (4-4) it is sufficient to show that $A_{2m+1}(x) > 0$ in the range −∞ < x < 0.
I have been unable to show this in general; however, since the polynomials $H_j(x)$ are known
explicitly it is a fairly simple matter to verify that $A_{2m+1}(x) > 0$ for given m. In fact, for
the first few values of m, we find that $A_{2m+1}(x)$ is a polynomial of degree 2m in $x^2$, in which
only the coefficient of $x^2$ is negative. Thus

$A_1(x) = 1$,
$A_3(x) = 21 - 16x^2 + 12x^4$,
$A_5(x) = 4445 - 6664x^2 + 8076x^4 + 1344x^6 + 2880x^8$,
$A_7(x) = 3{,}884{,}041 - 8{,}667{,}072x^2 + 14{,}468{,}040x^4 + 6{,}808{,}896x^6 + 24{,}184{,}656x^8 + 11{,}093{,}760x^{10} + 3{,}628{,}800x^{12}$,
$A_9(x) = 9{,}580{,}522{,}329 - 28{,}374{,}712{,}624x^2 + 60{,}246{,}981{,}384x^4 + 54{,}671{,}037{,}120x^6 + 292{,}158{,}113{,}616x^8 + 337{,}984{,}717{,}824x^{10} + 259{,}516{,}161{,}792x^{12} + 88{,}044{,}104{,}640x^{14} + 14{,}631{,}321{,}600x^{16}$.

In each case we note that $A_{2m+1}(x)$ is positive definite since the first three terms (i.e. in
$x^0$, $x^2$ and $x^4$) form a positive definite quadratic in $x^2$ and hence the conjecture that $\eta_{2m+1}(P)$
is increasing as P increases from zero to one is proved for m = 0, 1, 2, 3 and 4.* Thus all
cases likely to be required in practice are covered.
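
The polynomials above can be regenerated mechanically from the recurrence (4-1) and the definition (4-6). The following is a minimal sketch, not the author's own computation: the helper names `h_polys`, `poly_mul` and `a_poly` are illustrative, polynomials are held as integer coefficient lists indexed by the power of x, and (4-1) and (4-6) are read as $H_{j+1} = jxH_j + H'_j$ and $A_{2m+1} = (2m+1)H_{2m+1}H_{2m+3} - (2m+2)H^2_{2m+2}$.

```python
def h_polys(jmax):
    """Return {j: coefficients of H_j} (index = power of x), using (4-1)."""
    H = {1: [1]}
    for j in range(1, jmax):
        h = H[j]
        nxt = [0] * (len(h) + 1)
        for k, c in enumerate(h):
            nxt[k + 1] += j * c          # contribution of j * x * H_j
            if k > 0:
                nxt[k - 1] += k * c      # contribution of the derivative H_j'
        H[j + 1] = nxt
    return H


def poly_mul(a, b):
    """Product of two coefficient lists."""
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


def a_poly(m):
    """Coefficients of A_{2m+1} in increasing powers of x^2, from (4-6)."""
    H = h_polys(2 * m + 3)
    j = 2 * m + 1
    first = poly_mul(H[j], H[j + 2])
    second = poly_mul(H[j + 1], H[j + 1])
    full = [j * f - (j + 1) * s for f, s in zip(first, second)]
    even = full[0::2]                    # odd powers of x vanish identically
    while len(even) > 1 and even[-1] == 0:
        even.pop()                       # the leading x^(4m+2) terms cancel
    return even


if __name__ == "__main__":
    for m in range(5):
        print(2 * m + 1, a_poly(m))
```

Only integer arithmetic is used, so the printed coefficients are exact and can be compared term by term with the list given in the text.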


* I owe the suggestion of this proof to D. E. Barton.

Incidentally, since $\eta_{2m}(0) = \eta_{2m}(\tfrac{1}{2}) = 0$, it cannot be true that $\eta_{2m}(P)$ is an increasing
function of P throughout the range 0 < P < 1.


It is of interest to note that

$$A_j(x)\,Z^{-2(j+1)}(x) = j\,x_j(P)\,x_{j+2}(P) - (j+1)\,x^2_{j+1}(P)$$

and so using the complex integral form for these derivatives we have to show that

$$\oint_C\frac{d\xi}{\{P(\xi)-P(x)\}^{j}}\oint_C\frac{d\xi}{\{P(\xi)-P(x)\}^{j+2}} < \left(\oint_C\frac{d\xi}{\{P(\xi)-P(x)\}^{j+1}}\right)^2$$ (4-7)

is true when j is odd for any x, but is not true for general x when j is even.

In equation (4-7), C is to be a closed contour including only the zero of P(ξ) − P(x) which
occurs at the point on the real axis at ξ = x.


5. BOUNDS FOR $R_{2m}$


It is easily shown that $E(x_{r:n}) = -E(x_{n-r+1:n})$. As a consequence we will consider only
the cases for which $p_r > \tfrac{1}{2}$ and note that the results extend in an obvious manner to the cases
for which $p_r < \tfrac{1}{2}$.

In equation (3-8) make the substitution

$$u = p_r + y(P-p_r)$$

and we have

$$R_{2m} = \frac{1}{(2m)!}\left[-\int_0^{p_r}du\int_0^{u}dP + \int_{p_r}^{1}du\int_{u}^{1}dP\right](P-u)^{2m}\,x_{2m+1}(u)\,\beta_P(r, n-r+1).$$ (5-1)

In the second of the two integrals, replacing u by 1 − u, P by 1 − P
and using the symmetrical properties of $x_{2m+1}(u)$ expressed in equation (4-2), it is seen that

$$R_{2m} = \frac{1}{(2m)!}\iint_{0<P<u<q_r}(u-P)^{2m}\,x_{2m+1}(u)\,\{\beta_P(n-r+1, r)-\beta_P(r, n-r+1)\}\,du\,dP$$

$$\qquad - \frac{1}{(2m)!}\int_{q_r}^{p_r}du\int_0^{u}dP\,(u-P)^{2m}\,x_{2m+1}(u)\,\beta_P(r, n-r+1).$$ (5-2)

We write $I_1$ and $I_2$ for the first and second of the integrals occurring in equation (5-2), so that $R_{2m} = I_1 - I_2$, and
observe that both integrals are positive, since the integrands are everywhere positive (when
$q_r < \tfrac{1}{2}$). Now

$$I_1 = \frac{1}{(2m)!}\iint_{0<P<u<q_r}\left(1-\frac{P}{u}\right)^{2m}\frac{\eta_{2m+1}(u)}{u}\,\{\beta_P(n-r+1, r)-\beta_P(r, n-r+1)\}\,du\,dP$$ (5-3)

and so, using the properties of $\eta_{2m+1}(u)$ described in § 4, we have

$$I_1 < \frac{\eta_{2m+1}(q_r)}{(2m)!}\iint_{0<P<u<q_r}\left(1-\frac{P}{u}\right)^{2m}\frac{1}{u}\,\{\beta_P(n-r+1, r)-\beta_P(r, n-r+1)\}\,du\,dP.$$ (5-4)

But

$$\int_P^{q_r}\left(1-\frac{P}{u}\right)^{2m}\frac{du}{u} < \int_P^{q_r}\left(1-\frac{P}{u}\right)^{2m}\frac{q_r\,du}{u^2} = \frac{(q_r-P)^{2m+1}}{(2m+1)\,q_r^{2m}\,P}.$$





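Finally, therefore

$$I_1 < \frac{\eta_{2m+1}(q_r)}{(2m+1)!\,q_r^{2m}}\int_0^{q_r}\frac{(q_r-P)^{2m+1}}{P}\,\{\beta_P(n-r+1, r)-\beta_P(r, n-r+1)\}\,dP.$$ (5-5)

Returning to the definition of $I_1$ in equation (5-2), since $x_{2m+1}(u)$ is a decreasing function
of u in $(0, q_r < \tfrac{1}{2})$, and $x_{2m+1}(q_r) = x_{2m+1}(p_r)$ by (4-2), we have

$$I_1 > \frac{x_{2m+1}(p_r)}{(2m)!}\iint_{0<P<u<q_r}(u-P)^{2m}\,\{\beta_P(n-r+1, r)-\beta_P(r, n-r+1)\}\,dP\,du$$

so that, performing the integration over u,

$$I_1 > \frac{x_{2m+1}(p_r)}{(2m+1)!}\int_0^{q_r}(q_r-P)^{2m+1}\,\{\beta_P(n-r+1, r)-\beta_P(r, n-r+1)\}\,dP.$$ (5-6)

For $I_2$, since $x_{2m+1}(u)$ is an increasing function of |x|, we have

$$I_2 < \frac{x_{2m+1}(p_r)}{(2m)!}\int_{q_r}^{p_r}du\int_0^{u}dP\,(u-P)^{2m}\,\beta_P(r, n-r+1),$$ (5-7)

$$I_2 > \frac{x_{2m+1}(\tfrac{1}{2})}{(2m)!}\int_{q_r}^{p_r}du\int_0^{u}dP\,(u-P)^{2m}\,\beta_P(r, n-r+1).$$ (5-8)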


Dividing the region of integration into two parts

$$\int_{q_r}^{p_r}du\int_0^{u}dP = \int_{q_r}^{p_r}du\int_0^{q_r}dP + \int_{q_r}^{p_r}dP\int_P^{p_r}du$$

and integrating out over u we have, after some slight rearrangement of the integrals,

$$I_2 < -\frac{x_{2m+1}(p_r)}{(2m+1)!}\left[M_{2m+1}(r, n-r+1) + \int_0^{q_r}(P-q_r)^{2m+1}\,\{\beta_P(n-r+1, r)-\beta_P(r, n-r+1)\}\,dP\right],$$ (5-9)

$$I_2 > -\frac{x_{2m+1}(\tfrac{1}{2})}{(2m+1)!}\left[M_{2m+1}(r, n-r+1) + \int_0^{q_r}(P-q_r)^{2m+1}\,\{\beta_P(n-r+1, r)-\beta_P(r, n-r+1)\}\,dP\right].$$ (5-10)
Combining equations (5-6) and (5-9) we have after cancellation

$$R_{2m} > \frac{x_{2m+1}(p_r)}{(2m+1)!}\,M_{2m+1}(r, n-r+1).$$ (5-11)

Similarly, combining the results expressed in equations (5-5) and (5-10) we arrive at an
upper bound for $R_{2m}$ which may be conveniently written in the form

$$R_{2m} < \frac{q_r\,x_{2m+1}(p_r)}{(2m+1)!}\,\{V_{2m+1}(r, n-r+1: q_r) - V_{2m+1}(n-r+1, r: q_r)\}$$

$$\qquad - \frac{x_{2m+1}(\tfrac{1}{2})}{(2m+1)!}\,\{W_{2m+1}(r, n-r+1: q_r) - W_{2m+1}(n-r+1, r: q_r) - M_{2m+1}(r, n-r+1)\},$$ (5-12)

where

$$W_s(a, b: v) = \int_0^{v}(P-v)^s\,\beta_P(a, b)\,dP,$$ (5-13)

$$V_s(a, b: v) = \int_0^{v}P^{-1}(P-v)^s\,\beta_P(a, b)\,dP.$$ (5-14)


6-2 
6. COMPUTATION OF $M_s(a, b)$, $W_s(a, b: v)$ AND $V_s(a, b: v)$
The values of $M_s(a, b)$, $W_s(a, b: v)$ and $V_s(a, b: v)$ may be generated using

$$M_{s+1}(a, b) = \frac{s}{a+b+s}\left\{\frac{b-a}{a+b}\,M_s(a, b) + \frac{ab}{(a+b)^2}\,M_{s-1}(a, b)\right\},$$ (6-1)

$$V_{s+1}(a, b: v) + v\,V_s(a, b: v) = W_s(a, b: v),$$ (6-2)

$$(a+b+s)\,W_{s+1}(a, b: v) = \{a - v(a+b) + s(1-2v)\}\,W_s(a, b: v) + s\,v(1-v)\,W_{s-1}(a, b: v),$$ (6-3)

where we will define

$$s\,W_{s-1}(a, b: v) = -\beta_v(a, b) \quad (s = 0), \qquad = \int_0^{v}\beta_P(a, b)\,dP \quad (s = 1),$$

and

$$V_0(a, b: v) = \frac{a+b-1}{a-1}\int_0^{v}\beta_P(a-1, b)\,dP.$$

Equation (6-1) is a result which follows directly from the differential equation which
underlies the Pearson system of frequency curves. Equation (6-3) follows using a similar
technique, whilst equation (6-2) is obtained by simply combining the two integrals implicit
in the left-hand side of the equation.


7. PLACKETT’S METHOD 


The essential details of Plackett’s method are reproduced here for purposes of com- 
parison. 





Plackett defines

$$L(x) = \log\frac{P(x)}{1-P(x)}.$$ (7-1)

Then $K_j(r, n)$, the jth cumulant of $L(x_{r:n})$, is given by

$$K_j(r, n) = D^j\{\log\Gamma(r+w)\,\Gamma(n-r+1-w)\}\big|_{w=0}.$$ (7-2)

These cumulants are either tabulated in terms of the polygamma functions or may be expressed
as a finite series.

Using a Taylor series expansion for $x_{r:n}$ about the point where $L(x_{r:n}) = K_1(r, n)$ we
obtain after integrating over the p.d.f. of $x_{r:n}$

$$E(x_{r:n}) = \sum_{j=0}^{s}\frac{1}{j!}\,x_j(L = K_1)\,m_j(r, n) + \frac{1}{(s+1)!}\,E\{(L-K_1)^{s+1}\,x_{s+1}[L = K_1 + y(L-K_1)]\},$$ (7-3)

where

$$m_j(r, n) = E[(L(x_{r:n})-K_1)^j] \quad\text{and}\quad x_j(L) = \left(\frac{d}{dL}\right)^j x(L).$$ (7-4)

Derivatives and moments in the finite sum may be evaluated. Plackett notes that $x_j(L)$
is bounded and provides the table

  j                   1         2         3         4
  max_L |x_j(L)|      0.62666   0.07376   0.06724   0.04597

Also

$$E|L-K_1|^{s+1} = E(L-K_1)^{s+1} \quad (s \text{ odd}),$$

$$E|L-K_1|^{s+1} \le \{E(L-K_1)^{s+2}\}^{(s+1)/(s+2)} \quad (s \text{ even}),$$

so that we have finally

$$\left|E(x_{r:n}) - \sum_{j=0}^{s}\frac{1}{j!}\,x_j(L = K_1)\,m_j(r, n)\right| \le \frac{1}{(s+1)!}\max_L|x_{s+1}(L)|\;A_{s+1}(r, n),$$ (7-5)

with

$$A_j(r, n) = m_j(r, n) \quad (j \text{ even}), \qquad = (m_{j+1}(r, n))^{j/(j+1)} \quad (j \text{ odd}).$$


8. A COMPARISON OF THE TWO METHODS 


To illustrate the methods, we consider the case when r = 6; n = 9. The true value of
$E(x_{6:9})$ is 0.27452592 to eight places of decimals (see Teichroew, 1956). Thus we are able
to give the true error after a given number of terms of either series and compare this true
error with the bounds for error obtained from equations (5-11), (5-12) and (7-5).


David and Johnson series after 2m terms

                        Bounds for error
  m    True error      Lower        Upper
  1    +0.002662     -0.002372    +0.010997
  2    +0.000668     -0.001176    +0.003668
  3    +0.000127     -0.000729    +0.001117
  4    +0.000043     -0.000541    +0.000424

Plackett series after s terms

  s    True error    Absolute bound for error
  2    +0.001240     0.009078
  3    +0.001714     0.001753
  4    +0.000580     (not available)
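
The first row of the David and Johnson table can be reproduced numerically. The sketch below is a rough illustration under stated assumptions rather than the author's procedure: the function names are invented here, the beta central moments use the standard recurrence $M_{s+1} = \frac{s}{a+b+s}\{\frac{b-a}{a+b}M_s + \frac{ab}{(a+b)^2}M_{s-1}\}$, the series (3-4) is summed over t = 0, ..., 2m, and the lower bound (5-11) is read as $x_{2m+1}(p_r)M_{2m+1}(r, n-r+1)/(2m+1)!$. The normal quantile and density come from Python's `statistics.NormalDist`.

```python
from math import factorial
from statistics import NormalDist


def h_polys(jmax):
    """Coefficient lists of H_1..H_jmax from H_{j+1} = j x H_j + H_j'."""
    H = {1: [1]}
    for j in range(1, jmax):
        h = H[j]
        nxt = [0] * (len(h) + 1)
        for k, c in enumerate(h):
            nxt[k + 1] += j * c
            if k > 0:
                nxt[k - 1] += k * c
        H[j + 1] = nxt
    return H


def beta_central_moments(a, b, tmax):
    """Central moments M_0..M_tmax of a Beta(a, b) variable (assumed recurrence)."""
    M = [1.0, 0.0]
    for s in range(1, tmax):
        M.append(s / (a + b + s)
                 * ((b - a) / (a + b) * M[s] + a * b / (a + b) ** 2 * M[s - 1]))
    return M[: tmax + 1]


def dj_series(r, n, m):
    """Return (sum of the series to 2m terms, lower bound on the remainder)."""
    p = r / (n + 1)
    nd = NormalDist()
    x0 = nd.inv_cdf(p)                    # x(p_r)
    z = nd.pdf(x0)                        # Z(x(p_r))
    H = h_polys(2 * m + 1)
    M = beta_central_moments(r, n - r + 1, 2 * m + 1)

    def x_deriv(t):                       # x_t(p_r) = H_t(x) / Z(x)^t
        return sum(c * x0 ** k for k, c in enumerate(H[t])) / z ** t

    approx = x0 + sum(x_deriv(t) * M[t] / factorial(t)
                      for t in range(1, 2 * m + 1))
    lower = x_deriv(2 * m + 1) * M[2 * m + 1] / factorial(2 * m + 1)
    return approx, lower


if __name__ == "__main__":
    approx, lower = dj_series(6, 9, 1)
    print(0.27452592 - approx, lower)
```

With r = 6, n = 9 and m = 1, the true error and the lower bound computed this way agree with the first row of the table to the accuracy shown there.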


Term for term (i.e. setting s = 2m) the Plackett series seems to converge a little more
rapidly than does the David & Johnson series. Plackett has not given $\max|x_j(L)|$ above
j = 4, so that the bound for error after 4 terms is not given in the table above (these maxima
are very laborious to locate although once obtained they are relevant to any r and n). It
would seem that the Plackett bounds are a little finer than those which were obtained for
the David & Johnson series.

The slight superiority of the Plackett series in respect of convergence is overshadowed
by the computational advantages of the David & Johnson technique. If we return to equation
(3-4) we note that the derivatives $x_j(p_r)$ are functions of $p_r$ only and are therefore constant
with constant $p_r$. In addition, the moments $M_t(r, n-r+1)$ may be expressed as a
power series in 1/(n + 2) with coefficients which are functions of $p_r$ only. For example

$$M_2(r, n-r+1) = p_r q_r (n+2)^{-1},$$

$$M_3(r, n-r+1) = 2(q_r-p_r)\,p_r q_r\sum_{i=2}^{\infty}(-1)^i (n+2)^{-i}.$$

This enables us to write

$$E(x_{r:n}) = \sum_{i=0}^{\infty} H_i(p_r, 0, 1)\,(n+2)^{-i},$$

where $H_i(p_r, 0, 1)$ is a numerical coefficient which varies with $p_r = r/(n+1)$ only. Once these
coefficients have been obtained for a range of values of $p_r$, it is a very simple matter to
calculate $E(x_{r:n})$ for any values of r and n. David & Johnson have given tables of the
algebraic forms of the coefficients which occur in the expansion of $E(x^a_{r:n}\,x^b_{s:n}\,x^c_{t:n}\,x^d_{u:n})$ in
terms of a power series in 1/(n + 2), for a + b + c + d ≤ 4, as far as the coefficient in $(n+2)^{-3}$.

The author (Saw, 1958) has given the numerical values of the coefficients H,(p,, a,b) 
which occur in the expansion of the integral 


[ (FEN een Aoten—r+ dP = 3 Hipp a,b) (+2), 
0 i=0 


for p, = 0-50 (0-05) 0-80; i = 0(1)5;a+5b < 4. 
On the other hand, the coefficients in the Plackett series are functions of $K_j(r, n)$. Now
using equation (7-2)

$$K_1(r, n) = \frac{1}{n-r+1} + \frac{1}{n-r+2} + \ldots + \frac{1}{r-1},$$

which does not remain constant as some function of r and n remains constant. Hence we
need to calculate a new Plackett series for every combination of r and n.


I wish to record my gratitude to Dr F. N. David, Dr D. E. Barton and Dr C. L. Mallows 
for their encouragement in carrying out this investigation. 
More significance tests on the sphere
By G. S. WATSON†


University of Toronto 


1. INTRODUCTION 


It has been plausibly suggested (Fisher, 1953; Watson & Irving, 1957) that the directions
of remanent magnetization of specimens from the same source of suitable rock are distributed
on the unit sphere according to the formula

$$\frac{\kappa}{4\pi\sinh\kappa}\exp(\kappa\cos\theta).$$ (1-1)

Here cos θ = λl + μm + νn, (l, m, n) are the direction cosines of an observed direction,
(λ, μ, ν) are the direction cosines of the mean direction, and κ is an accuracy parameter, large
values of which lead to small dispersions. This paper is primarily concerned with a question
which is sometimes of geophysical importance: are the mean directions of a set of populations
all in the same plane?

The basic distribution theory of (1-1) was given by Fisher (1953). For a sample of N
from (1-1), the maximum likelihood (M.L.) estimator of (λ, μ, ν) is the set of direction cosines,
(l, m, n) say, of the vector resultant of the sample of unit vectors. If the length of the
resultant is denoted by R, then the M.L. estimator of κ, when (λ, μ, ν) is also estimated, is the
solution of

$$\coth k - \frac{1}{k} = \frac{R}{N}.$$ (1-2)

When N is large and R is near N, this has the approximate solution

$$k = \frac{N-1}{N-R}.$$ (1-3)

When (λ, μ, ν) is known, the M.L. equation for κ is the same as (1-2) with R replaced by
X = R cos γ, where γ is the angle between the resultant and (λ, μ, ν). The approximate
estimator $k_1$ of κ is then given by

$$k_1 = \frac{N}{N-X}.$$ (1-4)

It may be shown (Watson, 1956) that, for large κ,

$$2\kappa(N-R) \sim \chi^2_{2(N-1)}.$$ (1-5)

These results and a comparison of the identity

$$\sum(x_i-\mu)^2 = \sum(x_i-\bar{x})^2 + N(\bar{x}-\mu)^2$$

with

$$2\kappa(N-X) = 2\kappa(N-R) + 2\kappa(R-X)$$ (1-6)

leads to the suggestion that

$$(N-1)\,\frac{R-X}{N-R} \sim F_{2,\,2(N-1)}$$ (1-7)

be used as a test of a prescribed mean direction. Further tests were derived by the same

† This work was carried out while the author was visiting Princeton University and prepared in
connexion with research sponsored by the Office of Naval Research.
argument in Watson (1956). The same approach yields the procedures for analysing the 
coplanarity of mean vectors that will be given below. However, it is of interest to deduce 
them by likelihood-ratio methods. It will be apparent then that the above tests may be 
derived by this method too. 

The generalization of the foregoing to unit vectors in any number of dimensions was 
indicated in Watson & Williams (1956). The same is easily possible here. A brief review of 
studies of the dimensionality of the mean vectors of multivariate normal populations is 
given by Anderson (1958). For large samples a satisfactory solution based on chi-square is 
possible but, in small samples, difficult problems are posed by nuisance parameters. The 
distribution (1-1) presents the same difficulties. 


2. A TEST FOR COPLANARITY 


Consider p populations (1-1) with mean vectors $(\lambda_{(i)}, \mu_{(i)}, \nu_{(i)})$ (i = 1, ..., p) and all with the
same accuracy parameter κ. Suppose that, for i = 1, ..., p, a sample of $N_i$ has been drawn
from the ith population and that it yields a resultant with direction cosines $(l_i, m_i, n_i)$ and of
length $R_i$. Writing $\sum_{i=1}^{p}N_i = N$, the logarithm of the likelihood of the entire set of observations
is proportional to

$$N(\log\kappa - \log\sinh\kappa) + \kappa\sum_{i=1}^{p}R_i\,(l_i\lambda_{(i)} + m_i\mu_{(i)} + n_i\nu_{(i)}).$$ (2-1)

If a known direction (λ, μ, ν) is orthogonal to all the population means, it is easily shown
that the M.L. estimator of $(\lambda_{(i)}, \mu_{(i)}, \nu_{(i)})$ is given by

$$(\hat\lambda_{(i)}, \hat\mu_{(i)}, \hat\nu_{(i)}) = \frac{[\,l_i-(\Sigma\lambda l_i)\lambda,\ m_i-(\Sigma\lambda l_i)\mu,\ n_i-(\Sigma\lambda l_i)\nu\,]}{\sqrt{1-(\Sigma\lambda l_i)^2}},$$ (2-2)

where $\Sigma\lambda l_i = \lambda l_i + \mu m_i + \nu n_i$. Under the same conditions, the estimator of κ is given by the
solution of

$$\coth\hat\kappa - \frac{1}{\hat\kappa} = \frac{\sum_i R_i\sqrt{1-(\Sigma\lambda l_i)^2}}{N}.$$ (2-3)

A comparison with (1-2) shows that $\sum R_i\sqrt{1-(\Sigma\lambda l_i)^2}$ takes the place of R. If no restriction
is placed on the mean vectors, then $(l_i, m_i, n_i)$ is the M.L. estimator of $(\lambda_{(i)}, \mu_{(i)}, \nu_{(i)})$ and the
estimator of κ is the solution of

$$\coth\kappa^* - \frac{1}{\kappa^*} = \frac{\sum_{i=1}^{p}R_i}{N}.$$ (2-4)

To test the null hypothesis that all the mean vectors lie in a plane normal to the prescribed
vector (λ, μ, ν), we form the logarithm of the likelihood ratio in the usual way. On the null
hypothesis, (2-1) becomes

$$N(\log\hat\kappa - \log\sinh\hat\kappa) + \hat\kappa\sum_{i=1}^{p}R_i\sqrt{1-(\Sigma\lambda l_i)^2},$$ (2-5)

and since we are going to assume that N and κ are large, (2-5) reduces to

$$N(\log k - 1),$$ (2-6)

where

$$k = \frac{N}{N-\sum R_i\sqrt{1-(\Sigma\lambda l_i)^2}}.$$ (2-7)

On the alternative hypothesis (2-1) again takes on the form (2-6) but with k* instead of k,
where

$$k^* = \frac{N}{N-\sum R_i}.$$ (2-8)

Thus, by Wilks' Theorem,

$$-2N\log(k/k^*) \sim \chi^2_p.$$ (2-9)

Then (2-9) may be written as

$$2N\log\left[1+\frac{\sum R_i\{1-\sqrt{1-(\Sigma\lambda l_i)^2}\}}{N-\sum R_i}\right] \sim \chi^2_p.$$

When all the $N_i$ are large, this may further be approximated by

$$2N\,\frac{\sum R_i\{1-\sqrt{1-(\Sigma\lambda l_i)^2}\}}{N-\sum R_i} \sim \chi^2_p;$$ (2-10)

or finally as

$$N\,\frac{\lambda'U\lambda}{\sum(N_i-R_i)} \sim \chi^2_p,$$ (2-11)

where λ' = (λ, μ, ν), and

$$U = \begin{pmatrix}
\sum R_i l_i^2 & \sum R_i l_i m_i & \sum R_i l_i n_i\\
\sum R_i m_i l_i & \sum R_i m_i^2 & \sum R_i m_i n_i\\
\sum R_i n_i l_i & \sum R_i n_i m_i & \sum R_i n_i^2
\end{pmatrix}.$$ (2-12)

Had the analysis of variance analogue been pursued, it would have suggested that

$$\frac{N-p}{p}\,\frac{\lambda'U\lambda}{\sum(N_i-R_i)} \sim F_{p,\,2(N-p)},$$ (2-13)

which for large N is the same as (2-11). Thus (2-11) or (2-13) provide two approximate tests
of the hypothesis that a set of p mean vectors lie in a plane normal to λ.
The same likelihood-ratio procedure gives, as a test of coplanarity only, the statistic

$$N\,\frac{\min_\lambda\lambda'U\lambda}{\sum(N_i-R_i)} \sim \chi^2_{p-2}.$$ (2-14)

The analysis of variance analogy gives

$$\frac{N-p}{p-2}\,\frac{\min_\lambda\lambda'U\lambda}{\sum(N_i-R_i)} \sim F_{p-2,\,2(N-p)}.$$ (2-15)

In this case the value of A minimizing A’UA is clearly the latent vector of U corresponding to 
its least latent root and this vector is the M.L. estimator of the normal to the best fitting plane. 
The tests (2-13) and (2-15) can be conveniently arranged in an analysis of variance table, as 
is shown in our example. It is natural in practice to test first for coplanarity and then to test
the adequacy of a prescribed normal. This latter test is either of

$$N\,\frac{\lambda'U\lambda-\min_\lambda\lambda'U\lambda}{\sum(N_i-R_i)} \sim \chi^2_2$$ (2-16)

or

$$\frac{N-p}{2}\,\frac{\lambda'U\lambda-\min_\lambda\lambda'U\lambda}{\sum(N_i-R_i)} \sim F_{2,\,2(N-p)}.$$ (2-17)

These tests provide a confidence cone for λ, namely the set of all non-significant λ. This will be
illustrated in our example.


A procedure which should be more robust against differences between the accuracies in 
the populations is provided by 





(2-17) 

















y= (Bal) ¥ 22. (2-18) 
Writing s 
GD Afdim, a8 
W= Ea mle SB, mi, mi =, M,N, (2-19) 
Ea ial ee mm, D>» ee nt, 





this leads to an approximate analysis, 


a 


A’Wa — min A’WA = ‘Deviations from assumed normal’ ~ y3. 
2 


min A’WA = ‘Devia’ ons from coplanarity’ ~ x%_», | 
(2-20) 


3. AN EXAMPLE 


A part of a larger collection of data (kindly supplied by Dr K. M. Creer) whose analysis and 
geophysical significance will be discussed elsewhere, may be used to illustrate the foregoing 
procedures. 

Samples were taken from three populations whose mean vectors were, on a certain 
supposition, coplanar. The table below gives the sample sizes, the direction cosines and 
lengths of the sample resultants and the within-sample estimates of «. These figures contain 
all the relevant information in une data. 


  Population      l          m          n          R        N      k
  A            -0.0698    -0.9589    -0.2750    33.172     35    18.6
  B            -0.8572     0.2575     0.4460     8.567      9    18.5
  C            -0.5469    -0.7302    -0.4095     5.786      6    23.4
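
The within-sample estimates k in the last column can be recovered from R and N. The sketch below is illustrative and rests on two assumptions: that the approximate estimator (1-3) reads k = (N − 1)/(N − R), and that the M.L. equation (1-2) is coth k − 1/k = R/N, here solved by simple bisection. The function names and the bracketing interval are choices made for this example only.

```python
from math import tanh


def k_approx(N, R):
    """Approximate estimator of kappa, assumed form (N - 1) / (N - R)."""
    return (N - 1) / (N - R)


def k_ml(N, R, lo=1e-3, hi=1e4, iters=200):
    """Solve coth(k) - 1/k = R/N by bisection (left side increases with k)."""
    target = R / N

    def f(k):
        return 1.0 / tanh(k) - 1.0 / k - target

    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if f(mid) < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    samples = {"A": (35, 33.172), "B": (9, 8.567), "C": (6, 5.786)}
    for name, (N, R) in samples.items():
        print(name, round(k_approx(N, R), 1), round(k_ml(N, R), 3))
```

For concentrations of this size coth k is practically 1, so the root of the M.L. equation is very nearly N/(N − R), slightly larger than the tabulated (N − 1)/(N − R).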


The matrices U and W and their latent roots and vectors were computed electronically by 


Dr Creer. The least roots of U and W are respectively 0-02682 and 0-5900. Hence 
2(N —p) }min2’UA 0-02682 _ 
—a a oa 


is to be referred to the F-distribution with 1 and 94 D.F. The other root 0.590 is to be referred
to the chi-square distribution with 1 D.F. The close agreement of the two statistics is a
reflexion of the very similar values of k in the three populations. There is no evidence of
significant non-coplanarity.

To put a cone of confidence around the normal to this common plane, it is natural to use
the latent vectors as new co-ordinate axes. In this case (2-15) may be written, for 95%
confidence,

½(47) (9.9497λ*² + 37.5481μ*² + 0.02682ν*² − 0.02682) ≤ 3.05.

Writing ν* = cos θ, λ* = sin θ cos φ, μ* = sin θ sin φ, this becomes

sin²θ (9.9497 cos²φ + 37.5481 sin²φ) + 0.02682 cos²θ ≤ 0.103.

The cross-section of this cone is elliptical. The average semi-angle θ is given by

23.7489 sin²θ + 0.02682 cos²θ = 0.103

so that θ = 3° 15'. However, $\theta_{\max}$, which occurs when φ = 0°, 180°, is 5° while $\theta_{\min}$, which
occurs for φ = 90°, 270°, is 2° 35'. Since the latent vector of U corresponding to the minimum
root is
(0.4802, 0.2004, −0.8540),

the 95% cone of confidence for the normal to the plane in which the three population means
lie has this latent vector as axis and a semi-angle between 2½° and 5°, with a mean of 3°.

The cone of confidence obtained from the weighted analysis using W is very similar. For
example, the cosine of the angle between the axes of the two cones is 0.9999927 so that the
angle is less than 6'.
An approximation to the multinomial distribution:
some properties and applications 


By N. L. JOHNSON 
University College London 


1. When investigating the properties of statistics constructed from discontinuous random 
variables, it is often useful to have a reasonably good approximation in terms of continuous 
variables. The approximation to the distribution of the contingency table x? statistic by 
means of a continuous Pearson Type III variable is a well-known example. In searching 
for such an approximate representation it is a common practice to try to arrange that 
certain important properties of the original variables—range of variation, and/or lower 
order moments—shall be preserved. Most approximations of this type have been worked 
out for single variables, but there is no reason, in principle, why similar approximations 
cannot be used for multivariate systems. 

In this paper the properties of an approximation to the multinomial distribution

$$P(n_1, n_2, \ldots, n_k) = N!\prod_{j=1}^{k}(\pi_j^{n_j}/n_j!) \qquad \left(\sum_{1}^{k}\pi_j = 1;\ \sum_{1}^{k}n_j = N\right)$$ (1)

will be discussed in some detail. The approximation is obtained by regarding the relative
frequencies $f_j = n_j/N$ as continuous variables with joint probability density functions
derived from

$$g(f_1, f_2, \ldots, f_k) = \Gamma(N-1)\prod_{j=1}^{k}\left[f_j^{(N-1)\pi_j-1}/\Gamma((N-1)\pi_j)\right].$$ (2)

The joint probability density function of any k − 1 out of the k f's is obtained by replacing
the missing $f_j$ by $1-\sum_{i\neq j}f_i$ in $g(f_1, f_2, \ldots, f_k)$.


This approximation gives the correct first and second moments and product moments for 
the joint distribution of the f’s, and also the correct range of variation. It has been used by 
Johnson & Young (1960) to derive approximate distributions for the range, (max n; — min n,), 
and for max n,. Some other applications are discussed in the present paper. 


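2. From (2) it follows that the general product moment about zero,

$$\mu'_\alpha(f) = \mu'_{\alpha_1,\alpha_2,\ldots,\alpha_k}(f) = E\left(\prod_{j=1}^{k}f_j^{\alpha_j}\right),$$

is equal to

$$\frac{\Gamma(N-1)}{\Gamma(N-1+\sum\alpha_j)}\prod_{j=1}^{k}\frac{\Gamma((N-1)\pi_j+\alpha_j)}{\Gamma((N-1)\pi_j)}$$ (3)

or, equivalently,

$$\mu'_\alpha(n) = \mu'_{\alpha_1,\alpha_2,\ldots,\alpha_k}(n) = E\left(\prod_{j=1}^{k}n_j^{\alpha_j}\right) = \frac{N^{\Sigma\alpha_j}\,\Gamma(N-1)}{\Gamma(N-1+\sum\alpha_j)}\prod_{j=1}^{k}\frac{\Gamma((N-1)\pi_j+\alpha_j)}{\Gamma((N-1)\pi_j)}.$$

The exact multinomial factorial moments of the $n_j$'s, based on (1), are

$$\mu_{(\alpha)}(n) = \mu_{(\alpha_1,\alpha_2,\ldots,\alpha_k)}(n) = E\left(\prod_{j=1}^{k}n_j^{(\alpha_j)}\right) = N^{(\Sigma\alpha_j)}\prod_{j=1}^{k}\pi_j^{\alpha_j}.$$ (4)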
Considering now the first- and second-order moments and product moments, both (3)
and (4) give

$$E(f_j) = \pi_j; \qquad \operatorname{var}(f_j) = \pi_j(1-\pi_j)/N; \qquad \operatorname{cov}(f_i, f_j) = -\pi_i\pi_j/N.$$

This agreement between approximation (2) and the multinomial distribution (1) does not
extend to the higher moments. The table below compares the values of $\sqrt{\beta_1}$ and $\beta_2$ of $f_j$ for
the two distributions.

Exact (1)
  $\sqrt{\beta_1}$:  $(1-2\pi_j)/[N\pi_j(1-\pi_j)]^{1/2}$
  $\beta_2$:  $3+[1-6\pi_j(1-\pi_j)]/[N\pi_j(1-\pi_j)]$

Approximation (2)
  $\sqrt{\beta_1}$:  $[2N/(N+1)]\,(1-2\pi_j)/[N\pi_j(1-\pi_j)]^{1/2}$
  $\beta_2$:  $3N(N-1)/[(N+1)(N+2)]+\{6N^2/[(N+1)(N+2)]\}\,[1-3\pi_j(1-\pi_j)]/[N\pi_j(1-\pi_j)]$

The approximation gives skewness twice as large as the exact value, and kurtosis nearly
six times as large. As we shall see in the next section, however, these disagreements are not
reflected in any serious discrepancy in calculated probabilities.

We may note that the exact value for $E(f_i^2 f_j)$ is

$$(N-1)\,\pi_i\pi_j\,\{(N-2)\pi_i+1\}\,N^{-2},$$

while (2) gives

$$(N-1)\,\pi_i\pi_j\,[(N-1)\pi_i+1]\,N^{-1}(N+1)^{-1}.$$
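
These moment comparisons are easy to confirm with exact rational arithmetic. The sketch below is a check under stated assumptions, not part of the paper: it evaluates the product moment (3) for non-negative integer powers, using the fact that Γ(x + a)/Γ(x) is the rising factorial x(x + 1)...(x + a − 1). The names `rising` and `approx_moment`, and the particular N and π values, are illustrative.

```python
from fractions import Fraction


def rising(x, n):
    """Rising factorial x (x + 1) ... (x + n - 1), equal to Gamma(x + n) / Gamma(x)."""
    out = Fraction(1)
    for i in range(n):
        out *= x + i
    return out


def approx_moment(N, pis, alphas):
    """E(prod f_j^alpha_j) under density (2), from (3), for integer alphas >= 0."""
    out = Fraction(1) / rising(Fraction(N - 1), sum(alphas))
    for pi, a in zip(pis, alphas):
        out *= rising((N - 1) * pi, a)
    return out


if __name__ == "__main__":
    N = 20
    pis = [Fraction(1, 5), Fraction(3, 10), Fraction(1, 2)]
    mean = approx_moment(N, pis, [1, 0, 0])
    var = approx_moment(N, pis, [2, 0, 0]) - mean ** 2
    cov = approx_moment(N, pis, [1, 1, 0]) - pis[0] * pis[1]
    print(mean, var, cov)                 # compare with pi, pi(1-pi)/N, -pi_i pi_j/N
    print(approx_moment(N, pis, [2, 1, 0]))
```

The first two orders of moments coincide with the multinomial values, while the third-order product moment already differs from the exact multinomial expression in its dependence on N.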
The expected value of $f_j^{-1}$ given by (3) is

$$(N-2)/[(N-1)\pi_j-1].$$

In fact, $f_j^{-1}$ has an infinite expected value since $\Pr\{f_j = 0\} > 0$. However, if the distribution
of $f_j$ be truncated by omission of the value $f_j = 0$ the expected value is finite. Such distributions
(or, equivalently, those of $n_j$) have been studied by Fieller & Hartley (1954) and
David & Johnson (1956) inter alia. The latter authors obtained an expansion of the form

$$E(f_j^{-1}) = \pi_j^{-1}\left[1+\frac{1-\pi_j}{N\pi_j}+\frac{(2-\pi_j)(1-\pi_j)}{N^2\pi_j^2}+\ldots\right].$$

Expanding the value $(N-2)/[(N-1)\pi_j-1]$ derived from approximation (2) in powers
of $N^{-1}$ we get

$$E(f_j^{-1}) = \pi_j^{-1}\left[1+\frac{1-\pi_j}{N\pi_j}+\frac{(1+\pi_j)(1-\pi_j)}{N^2\pi_j^2}+\ldots\right].$$

The two expressions (up to terms in $N^{-2}$) agree to order $N^{-1}$ and differ only by
$(1-2\pi_j)(1-\pi_j)/N^2\pi_j^3$.
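3. From (2), the approximate joint probability density function of $f_1, f_2, \ldots, f_{k-1}$ is

$$p(f_1, f_2, \ldots, f_{k-1}) = \frac{\Gamma(N-1)}{\prod_{j=1}^{k}\Gamma((N-1)\pi_j)} \Big(1-\sum_{j=1}^{k-1} f_j\Big)^{(N-1)\pi_k-1} \prod_{j=1}^{k-1} f_j^{(N-1)\pi_j-1} \quad \Big(\sum_{j=1}^{k-1} f_j \le 1,\ 0 \le f_j\Big). \qquad (5a)$$

Integrating out $f_{k-1}$, the approximate joint probability density function of $f_1, f_2, \ldots, f_{k-2}$ is

$$p(f_1, f_2, \ldots, f_{k-2}) = \frac{\Gamma(N-1)}{\Gamma((N-1)(\pi_{k-1}+\pi_k))\prod_{j=1}^{k-2}\Gamma((N-1)\pi_j)} \Big(1-\sum_{j=1}^{k-2} f_j\Big)^{(N-1)(\pi_{k-1}+\pi_k)-1} \prod_{j=1}^{k-2} f_j^{(N-1)\pi_j-1} \quad \Big(\sum_{j=1}^{k-2} f_j \le 1,\ 0 \le f_j\Big). \qquad (5b)$$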


Repeating the process we find

$$p(f_1, f_2, \ldots, f_s) = \frac{\Gamma(N-1)}{\Gamma\big((N-1)\sum_{j=s+1}^{k}\pi_j\big)\prod_{j=1}^{s}\Gamma((N-1)\pi_j)} \Big(1-\sum_{j=1}^{s} f_j\Big)^{(N-1)\sum_{j=s+1}^{k}\pi_j-1} \prod_{j=1}^{s} f_j^{(N-1)\pi_j-1} \quad \Big(\sum_{j=1}^{s} f_j \le 1,\ f_j \ge 0\Big), \qquad (5c)$$

and

$$p(f_j) = \frac{\Gamma(N-1)}{\Gamma((N-1)\pi_j)\,\Gamma((N-1)(1-\pi_j))}\, f_j^{(N-1)\pi_j-1}(1-f_j)^{(N-1)(1-\pi_j)-1} \quad (0 < f_j < 1). \qquad (5d)$$

We see that the approximate distributions of subsets of the $f$'s, derived from (2), are of
the same form as the initial function $p(f_1, f_2, \ldots, f_{k-1})$. The approximation is thus
self-consistent in this respect.


Table 1. Type I approximation to the binomial

x = $I_{(r-\frac12)/N}((N-1)\pi, (N-1)(1-\pi))$ (approximate values)
y = $I_{1-\pi}(N-r+1, r)$ (exact values)

N = 25

        π = 0.1             π = 0.3             π = 0.5
  r     x        y          x        y          x        y
  1   0.0359   0.0717        —     0.0001        —        —
  3   0.5728   0.5371     0.0043   0.0090        —        —
  5   0.8959   0.9020     0.0880   0.0905     0.0002   0.0002
  7   0.9828   0.9905     0.3558   0.3407     0.0065   0.0073
  8   0.9936   0.9977     0.5239   0.5118     0.0214   0.0216
 10   0.9994   1.0000     0.8083   0.8105     0.1188   0.1147
 12   1.0000      —       0.9449   0.9557     0.3488   0.3450
 15      —        —       0.9974   0.9982     0.7825   0.7878
 18      —        —       1.0000   1.0000     0.9786   0.9784
 20      —        —          —        —       0.9985   0.9980

N = 50

        π = 0.1             π = 0.3             π = 0.5
  r     x        y          x        y          x        y
  1   0.0001   0.0051        —        —          —        —
  3   0.0999   0.1117        —        —          —        —
  5   0.4559   0.4312        —     0.0002        —        —
  7   0.7791   0.7702     0.0012   0.0024        —        —
 10   0.9676   0.9754     0.0361   0.0402        —        —
 12   0.9933   0.9968     0.1399   0.1390        —        —
 15   0.9996   0.9999     0.4574   0.4468     0.0011   0.0013
 17   1.0000   1.0000     0.6878   0.6838     0.0073   0.0076
 20      —        —       0.9118   0.9151     0.0604   0.0595
 22      —        —       0.9716   0.9749     0.1635   0.1611
 25      —        —       0.9968   0.9976     0.4446   0.4439
 27      —        —       0.9995   0.9997     0.6622   0.6641
 30      —        —       1.0000   1.0000     0.8970   0.8987
 32      —        —          —        —       0.9675   0.9676
 35      —        —          —        —       0.9970   0.9967
 37      —        —          —        —       0.9997   0.9996
From (5d)

$$\Pr\{n_j < r_j\} \approx I_{(r_j-\frac12)/N}((N-1)\pi_j, (N-1)(1-\pi_j)) \quad (r_j > 0), \qquad (6a)$$

where

$$I_x(m,n) = \frac{\Gamma(m+n)}{\Gamma(m)\Gamma(n)}\int_0^x t^{m-1}(1-t)^{n-1}\,dt$$

is the incomplete beta-function ratio. The exact value of this probability is

$$\Pr\{n_j < r_j\} = I_{1-\pi_j}(N-r_j+1, r_j). \qquad (6b)$$

The work of Davies (1933) on approximating the hypergeometric distribution $F(\alpha, \beta; \gamma; 1)$
by a Type I curve indicated that Type I should also give a good approximation to a binomial.
This was confirmed by Benedetti (1956).

Approximate and exact values calculated from (6a) and (6b), respectively, are compared 
in Table 1. Benedetti (1956) gives corresponding values for N = 100. The comparison in- 
dicates that approximation (2) is reasonably good, over the bulk of the probability mass, in 
regard to the marginal distributions it implies. 
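
Entries of Table 1 can be recomputed with a regularised incomplete beta function. The sketch below assumes SciPy is available and reads (6a) as using the argument $(r-\frac12)/N$ and (6b) as $I_{1-\pi}(N-r+1, r)$; the function names are illustrative only.

```python
from scipy.special import betainc  # regularised incomplete beta I_x(a, b)


def approx_prob(r, N, pi):
    """Approximate Pr{n_j < r}: Type I (beta) marginal evaluated at (r - 1/2)/N."""
    return betainc((N - 1) * pi, (N - 1) * (1 - pi), (r - 0.5) / N)


def exact_prob(r, N, pi):
    """Exact binomial Pr{n_j < r} written as an incomplete beta-function ratio."""
    return betainc(N - r + 1, r, 1 - pi)


if __name__ == "__main__":
    N, pi = 25, 0.1
    for r in (1, 3, 5, 7, 8, 10):
        print(r, round(float(approx_prob(r, N, pi)), 4),
              round(float(exact_prob(r, N, pi)), 4))
```

For $r = 1$ the exact value is simply $(1-\pi)^N$, which gives a quick sanity check on the second function.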

We have already seen, in § 2, that the approximation is not exact for moments of order 
higher than two (or less than zero). In §7 we will also show that it is not self-consistent in 
regard to conditional distributions (of one subset of f’s, given another subset). 


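4. From equations (5) we can obtain the distributions of sums of subsets of the $f_j$'s by
the usual processes of probability calculus. We find, for instance,

$$p\Big(\sum_{1}^{s} f_j\Big) = \frac{\Gamma(N-1)}{\Gamma\big((N-1)\sum_{1}^{s}\pi_j\big)\,\Gamma\big((N-1)\sum_{s+1}^{k}\pi_j\big)} \Big(\sum_{1}^{s} f_j\Big)^{(N-1)\sum_{1}^{s}\pi_j-1}\Big(1-\sum_{1}^{s} f_j\Big)^{(N-1)\sum_{s+1}^{k}\pi_j-1} \quad \Big(0 < \sum_{1}^{s} f_j < 1\Big). \qquad (5e)$$

This is of the same form as $p(f_j)$ in (5d) with $\pi_j$ replaced by $\sum_{1}^{s}\pi_j$. Since

$$\sum_{1}^{s} f_j = \Big(\sum_{1}^{s} n_j\Big)\Big/N$$

is the proportion of observations falling in the superclass formed by combining the first $s$
classes of the original multinomial, the approximation is self-consistent in this respect also.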


Equation (5e) (and similar equations) may also be obtained by the following instructive 
analogy. 


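Let $v_1, v_2, \ldots, v_k$ be mutually independent random variables distributed as $\chi^2$ with
$\nu_1, \nu_2, \ldots, \nu_k$ degrees of freedom, respectively. The joint probability density function of the
$v$'s is then

$$p(v_1, v_2, \ldots, v_k) = \prod_{j=1}^{k}\Big[2^{-\frac12\nu_j}\{\Gamma(\tfrac12\nu_j)\}^{-1}\, v_j^{\frac12\nu_j-1} e^{-\frac12 v_j}\Big] \quad (0 \le v_j).$$

Now apply the transformation

$$y_i = v_i\Big/\sum_{j=1}^{k} v_j \quad (i = 1, 2, \ldots, k-1); \qquad y_k = \sum_{j=1}^{k} v_j,$$

so that

$$v_i = y_i y_k \quad (i = 1, 2, \ldots, k-1); \qquad v_k = y_k\Big(1-\sum_{j=1}^{k-1} y_j\Big).$$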


The Jacobian of this transformation is

$$\frac{\partial(v_1, v_2, \ldots, v_k)}{\partial(y_1, y_2, \ldots, y_k)} = y_k^{k-1},$$

and so the joint probability density function of $y_1, y_2, \ldots, y_k$ is

$$p(y_1, y_2, \ldots, y_k) = \frac{y_k^{\frac12\sum_1^k\nu_j-1}\, e^{-\frac12 y_k}}{2^{\frac12\sum_1^k\nu_j}\prod_{j=1}^{k}\Gamma(\tfrac12\nu_j)} \Big(1-\sum_{j=1}^{k-1} y_j\Big)^{\frac12\nu_k-1} \prod_{j=1}^{k-1} y_j^{\frac12\nu_j-1}.$$

Integrating out $y_k$, we find the joint probability density function of the $(k-1)$ ratios
$y_1, y_2, \ldots, y_{k-1}$:

$$p(y_1, y_2, \ldots, y_{k-1}) = \frac{\Gamma\big(\tfrac12\sum_1^k\nu_j\big)}{\prod_{j=1}^{k}\Gamma(\tfrac12\nu_j)} \Big(1-\sum_{j=1}^{k-1} y_j\Big)^{\frac12\nu_k-1} \prod_{j=1}^{k-1} y_j^{\frac12\nu_j-1} \quad \Big(0 \le y_j;\ \sum_{j=1}^{k-1} y_j \le 1\Big), \qquad (5a')$$

which is of exactly the same form as (5a), except that $\nu_j$ is replaced by $2(N-1)\pi_j$ for each
$j = 1, 2, \ldots, k$. (It may be noted that Pearson (1923) states that L. N. G. Filon had, in 1901,
suggested (5a) as a possible ‘multivariate frequency surface’ for the joint distribution of 
a number of Pearson Type I variables. The same distribution was obtained by van Uven 
(1948) among his bivariate extensions of the Pearson system of curves.) 

Hence we may regard our approximation to the joint distribution of $f_1, f_2, \ldots, f_k$ as
equivalent to the replacement of these variables by

$$v_1\Big/\sum_{j=1}^{k} v_j, \quad v_2\Big/\sum_{j=1}^{k} v_j, \quad \ldots, \quad v_k\Big/\sum_{j=1}^{k} v_j,$$

respectively, the $v$'s being independent $\chi^2$ variables, $v_j$ having $2(N-1)\pi_j$ degrees of freedom.

Equation (5e) then follows from the additive property of independent $\chi^2$'s since $\sum_1^s v_j$ is
distributed as $\chi^2$ with $2(N-1)\sum_1^s \pi_j$ degrees of freedom.
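
The analogy can be illustrated by simulation. The following is a minimal sketch, assuming NumPy is available: it draws independent $\chi^2$ variables with $2(N-1)\pi_j$ degrees of freedom, forms the ratios $v_j/\sum v_j$, and compares their first two moments with the multinomial values $\pi_j$, $\pi_j(1-\pi_j)/N$ and $-\pi_i\pi_j/N$. The function name, seed and sample size are arbitrary choices made here.

```python
import numpy as np


def simulate_ratio_approximation(N, pi, size, seed=0):
    """Draw f_j = v_j / sum(v), with independent v_j ~ chi-square(2(N-1)pi_j)."""
    rng = np.random.default_rng(seed)
    pi = np.asarray(pi, dtype=float)
    v = rng.chisquare(2 * (N - 1) * pi, size=(size, pi.size))
    return v / v.sum(axis=1, keepdims=True)


if __name__ == "__main__":
    N = 30
    pi = np.array([0.2, 0.3, 0.5])
    f = simulate_ratio_approximation(N, pi, size=200_000)
    print("means          ", f.mean(axis=0))            # close to pi
    print("N * variances  ", N * f.var(axis=0))         # close to pi(1 - pi)
    print("N * cov(f1, f2)", N * np.cov(f[:, 0], f[:, 1])[0, 1])  # close to -pi1*pi2
```

Only the first- and second-order moments are expected to agree with the multinomial values; higher moments differ, as noted in §2.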


5. The analogy described in the last section is useful when forming approximations to 
the distributions of certain statistics constructed from multinomial variables. 

If the classes of the multinomial distribution are assigned fixed numerical values
$X_1, X_2, \ldots, X_k$ we have a model which can apply to a frequency distribution (possibly
grouped). The distribution of the $r$th crude moment

$$m'_r = \sum_{j=1}^{k} f_j X_j^r$$

would then be approximated by that of

$$\Big(\sum_{j=1}^{k} v_j X_j^r\Big)\Big/\Big(\sum_{j=1}^{k} v_j\Big),$$

i.e. the ratio of a weighted sum of independent $\chi^2$'s to the unweighted sum of the same $\chi^2$'s.

In the case when the degrees of freedom are integers (i.e. $\pi_j$ is an integer multiple of $\frac12(N-1)^{-1}$)
the ratio is of the form

$$\Big(\sum_{r=1}^{2(N-1)} \lambda_r u_r^2\Big)\Big/\Big(\sum_{r=1}^{2(N-1)} u_r^2\Big),$$

where $u_1, u_2, \ldots, u_{2(N-1)}$ are independent unit Normal variables and the $\lambda$'s are constants.
The distribution of random variables of this type can be investigated by the methods used 
by von Neumann (1941). 

Other consequences of the analogy are 

(a) The ratio of the frequency in the $i$th cell to the frequency in the $j$th cell is approximately
distributed as $(\pi_i/\pi_j) \times$ ($F$ with $2(N-1)\pi_i$, $2(N-1)\pi_j$ degrees of freedom).

(b) In the symmetrical case $\pi_1 = \pi_2 = \ldots = \pi_k = k^{-1}$ the distribution of the ratio
$(\max f_j)/(\min f_j) = (\max n_j)/(\min n_j)$ is approximated by that of the ratio $s^2_{\max}/s^2_{\min}$, for
which some percentage points are given in Table 31 of Pearson & Hartley (1958).

(c) In the symmetrical case of (b) the maximum relative frequency $\max f_j$ has a distribution
approximated by that of Cochran's criterion $s^2_{\max}\big/\sum_{1}^{k} s_j^2$, for which some percentage
points are given in Eisenhart, Hastay & Wallis (1947).
Some numerical investigations of (b) and (c) are given in Johnson & Young (1960). 
(d) The analogy does not lead to helpful results in the case of the statistic

$$\chi^2 = \sum_{j=1}^{k} (N/\pi_j^0)(f_j-\pi_j^0)^2$$

which is used to test the hypothesis $H_0$: $\pi_j = \pi_j^0$ $(j = 1, 2, \ldots, k)$. We have

$$\frac{\sum_{1}^{k} (N/\pi_j^0)\, v_j^2}{\big(\sum_{1}^{k} v_j\big)^2} - N,$$

which is not easily handled analytically.


The likelihood ratio test of $H_0$ with respect to the alternatives that $\pi_j \ne \pi_j^0$ for at least
two values of $j$ leads to the criterion


k 
M =-2N- PEL log (f;/773) 


k 
= —2(N— 1) a 14(f;/7$) log (f;/79). 


Large values of $M$ are regarded as significant of departure from the hypothesis $H_0$. The
distribution of $M$ is not easy to estimate, even using our approximation. In § 6, approximate
values of

$$\mu'_{\alpha,\beta}(f, \log f; \pi) = E\Big(\prod_{j=1}^{k} f_j^{\alpha_j}(\log f_j)^{\beta_j}\Big)$$

(using an obvious notation) are obtained. These could be used to find, approximately, the
moments of $M$ and so to approximate to its distribution and to the power function of the
test based on $M$.

Here we will only mention an alternative approach, using our analogy, which appears 
to be not unreasonable, applying the Bartlett-Neyman—Pearson test for equality of 


variances (Neyman & Pearson (1931), Bartlett (1937)). The analogous criterion, in terms
of the $f_j$'s, is

$$M' = -2(N-1)\sum_{j=1}^{k} \pi_j^0 \log(f_j/\pi_j^0).$$

Large values of $M'$ are regarded as significant of departure from the hypothesis $H_0$.
From Bartlett (1937)

$$M'\Big[1+\frac{1}{6(k-1)(N-1)}\Big(\sum_{j=1}^{k}\frac{1}{\pi_j^0}-1\Big)\Big]^{-1}$$

is approximately distributed as $\chi^2$ with $(k-1)$ degrees of freedom if $H_0$ is true.
M’ is strictly appropriate to testing the hypothesis H, against the alternatives (corre- 
sponding to unequal variances, but the same degrees of freedom) specified by 


T(N — k 
Ufa» for --sfe) = qo) — fee? (0 < fyi Bef, = 1) 2)" 
Tl P(N —1) 79)" ' 
j=1 


rather than $\pi_j \ne \pi_j^0$, as stated above (which corresponds to changes in the number of degrees
of freedom, the variances remaining the same). Nevertheless, the test based on $M'$ should
be fairly powerful with respect to hypotheses of type $\pi_j \ne \pi_j^0$ also, since the main effect, in
both kinds of alternative hypotheses, is to change the ratio of the expected values of the $f_j$'s.
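
A criterion of this Bartlett type is straightforward to compute. The sketch below is illustrative only: it assumes NumPy and SciPy, takes $M' = -2(N-1)\sum \pi_j^0 \log(f_j/\pi_j^0)$, and applies Bartlett's usual correction factor with $2(N-1)\pi_j^0$ degrees of freedom for the $j$th "variance", which gives the divisor $1 + \{\sum (1/\pi_j^0) - 1\}/\{6(k-1)(N-1)\}$. The function name is hypothetical and every cell count is assumed to be positive.

```python
import numpy as np
from scipy.stats import chi2


def bartlett_type_test(counts, pi0):
    """Return (M', corrected statistic, approximate p-value) for H0: pi = pi0."""
    counts = np.asarray(counts, dtype=float)
    pi0 = np.asarray(pi0, dtype=float)
    N, k = counts.sum(), counts.size
    f = counts / N                       # observed proportions, all assumed > 0
    m_prime = -2.0 * (N - 1) * np.sum(pi0 * np.log(f / pi0))
    correction = 1.0 + (np.sum(1.0 / pi0) - 1.0) / (6.0 * (k - 1) * (N - 1))
    stat = m_prime / correction
    return m_prime, stat, chi2.sf(stat, k - 1)


if __name__ == "__main__":
    print(bartlett_type_test([30, 30, 40], [0.2, 0.3, 0.5]))
```

When the observed proportions equal the hypothetical ones the statistic is zero, and by Jensen's inequality it is never negative.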


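6. We will suppose the values $f_j = 0$ omitted from our distributions; then we can
consider approximations to the expected value of

$$\prod_{j=1}^{k} \{f_j^{\alpha_j}(\log f_j)^{\beta_j}\}.$$

We first note that the joint characteristic function of the $k$ random variables $\log f_j$ is
approximately

$$\phi_{\log f}(t) = E\Big\{\exp\Big(i\sum_{1}^{k} t_j \log f_j\Big)\Big\} = E\Big(\prod_{j=1}^{k} f_j^{it_j}\Big) = \frac{\Gamma(N-1)}{\Gamma\big(N-1+i\sum_{1}^{k} t_j\big)} \prod_{j=1}^{k} \frac{\Gamma((N-1)\pi_j+it_j)}{\Gamma((N-1)\pi_j)}, \quad \text{from (3).}$$

Hence, taking logarithms and differentiating, we have, for the cumulants of $\log f_j$,

$$\kappa_s(\log f_j) = \Psi^{(s-1)}((N-1)\pi_j) - \Psi^{(s-1)}(N-1), \qquad (7a)$$

where

$$\Psi^{(s)}(x) = \frac{d^{s+1}}{dx^{s+1}} \log\Gamma(x)$$

is the $(s+2)$-gamma function of $x$.
For the cross-cumulants of $\log f_{j_1}$ and $\log f_{j_2}$ $(j_1 \ne j_2)$ we obtain

$$\kappa_{r_1 r_2}(\log f_{j_1}, \log f_{j_2}) = -\Psi^{(r_1+r_2-1)}(N-1), \qquad (7b)$$

and generally

$$\kappa_{r_1, r_2, \ldots, r_l}(\log f_{j_1}, \log f_{j_2}, \ldots, \log f_{j_l}) = -\Psi^{(R-1)}(N-1), \qquad (7c)$$

where $R = \sum_{j=1}^{l} r_j$.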

j=1 
Using equations (7) an approximation to any product moment

$$E\Big(\prod_{j=1}^{k} (\log f_j)^{\beta_j}\Big) = \mu'_{\beta}(\log f; \pi) = \mu'_{0,\beta}(f, \log f; \pi)$$

(extending the notation introduced in §5) can be found. In a number of cases explicit
formulae have been obtained but they are not given here.
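
The cumulants in equations (7) are differences of polygamma functions and can be evaluated directly. A minimal sketch, assuming SciPy is available (its `polygamma(n, x)` is the $n$th derivative of the digamma function, so $\Psi^{(s)}$ in the notation above corresponds to `polygamma(s, x)`); the function names are illustrative, and the cross-cumulant is taken with the negative sign that follows from differentiating $-\log\Gamma(N-1+i\sum t_j)$.

```python
from scipy.special import polygamma


def log_f_cumulant(s, N, pi_j):
    """kappa_s(log f_j) = Psi^(s-1)((N-1) pi_j) - Psi^(s-1)(N-1), as in (7a)."""
    return float(polygamma(s - 1, (N - 1) * pi_j) - polygamma(s - 1, N - 1))


def log_f_cross_cumulant(orders, N):
    """Cross-cumulant of orders r_1, ..., r_l for distinct cells, as in (7b) and (7c)."""
    R = sum(orders)
    return -float(polygamma(R - 1, N - 1))


if __name__ == "__main__":
    N, pi_j = 50, 0.3
    print("mean of log f_j    :", log_f_cumulant(1, N, pi_j))
    print("variance of log f_j:", log_f_cumulant(2, N, pi_j))
    print("cov(log f_i, log f_j):", log_f_cross_cumulant((1, 1), N))
```

A simple check is the case $N = 3$, $k = 2$, $\pi_1 = \pi_2 = \frac12$, where $f_1$ is uniform on (0, 1): the mean and variance of $\log f_1$ are then $-1$ and $1$.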


7. We have already seen (in §2) that approximation (2) does not give correct multi- 
nomial moments of order higher than two. In this section we conclude our paper by noting 
two further types of difference from the exact model. 


(a) MAXIMUM LIKELIHOOD ESTIMATORS 


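The maximum likelihood estimator of $\pi_j$ is $f_j$. If we use (2), which implies (5d), the
maximum likelihood estimator $\hat\pi_j$ of $\pi_j$ satisfies the equation (in which $\Psi(x) = \Psi^{(0)}(x)$)

$$-(N-1)[\Psi((N-1)\hat\pi_j) - \Psi((N-1)(1-\hat\pi_j))] + (N-1)\log[f_j/(1-f_j)] = 0,$$

i.e.

$$\log\frac{f_j}{1-f_j} = \Psi((N-1)\hat\pi_j) - \Psi((N-1)(1-\hat\pi_j)) \approx \log\frac{(N-1)\hat\pi_j - \frac12}{(N-1)(1-\hat\pi_j) - \frac12}.$$

Hence

$$f_j[1-\hat\pi_j-\tfrac12(N-1)^{-1}] \approx (1-f_j)[\hat\pi_j-\tfrac12(N-1)^{-1}]$$

and

$$\hat\pi_j \approx f_j + \frac{\frac12 - f_j}{N-1}.$$

$\hat\pi_j - f_j$ varies from $1/(2N)$ (approx.) when $f_j = 0$ to $-1/(2N)$ (approx.) when $f_j = 1$.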
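
The size of this adjustment can be examined numerically. The sketch below, assuming SciPy is available, solves the digamma equation $\log[f/(1-f)] = \Psi((N-1)\hat\pi) - \Psi((N-1)(1-\hat\pi))$ by root bracketing and compares the root with the closed form $f + (\frac12 - f)/(N-1)$ obtained from $\Psi(x) \approx \log(x - \frac12)$. The function names and the bracketing interval are choices made here, not part of the paper.

```python
import numpy as np
from scipy.optimize import brentq
from scipy.special import digamma


def mle_under_approximation(f, N):
    """Root of the likelihood equation implied by the beta marginal (5d), 0 < f < 1."""
    a = N - 1
    target = np.log(f / (1 - f))

    def score(p):
        return digamma(a * p) - digamma(a * (1 - p)) - target

    # score is increasing in p, negative near 0 and positive near 1
    return brentq(score, 1e-9, 1 - 1e-9)


def mle_closed_form(f, N):
    """Approximate solution f + (1/2 - f)/(N - 1)."""
    return f + (0.5 - f) / (N - 1)


if __name__ == "__main__":
    for f in (0.1, 0.2, 0.5, 0.8):
        print(f, mle_under_approximation(f, 50), mle_closed_form(f, 50))
```

The two values agree closely unless $(N-1)f$ or $(N-1)(1-f)$ is small, where the logarithmic approximation to $\Psi$ is least accurate.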


(b) CONDITIONAL DISTRIBUTIONS 


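The self-consistency of approximation (2) as exhibited in equations (5) does not extend
to the conditional distributions of some $f$'s, given others. Dividing (5a) by (5b) we get

$$p(f_{k-1} \mid f_1, f_2, \ldots, f_{k-2}) = \frac{\Gamma((N-1)(\pi_{k-1}+\pi_k))}{\Gamma((N-1)\pi_{k-1})\,\Gamma((N-1)\pi_k)} \times \frac{f_{k-1}^{(N-1)\pi_{k-1}-1}\big(1-\sum_{1}^{k-1} f_j\big)^{(N-1)\pi_k-1}}{\big(1-\sum_{1}^{k-2} f_j\big)^{(N-1)(\pi_{k-1}+\pi_k)-1}},$$

whence, making the transformation

$$f'_{k-1} = f_{k-1}\Big/\Big(1-\sum_{1}^{k-2} f_j\Big),$$

$$p(f'_{k-1} \mid f_1, f_2, \ldots, f_{k-2}) = \frac{\Gamma((N-1)(\pi_{k-1}+\pi_k))}{\Gamma((N-1)\pi_{k-1})\,\Gamma((N-1)\pi_k)}\, f'^{\,(N-1)\pi_{k-1}-1}_{k-1}(1-f'_{k-1})^{(N-1)\pi_k-1} \quad (0 < f'_{k-1} < 1). \qquad (8a)$$

This means that the conditional distribution of $f'_{k-1}$ is of form (5d) (with $j = k-1$) with
$\pi_{k-1}$ replaced by $\pi_{k-1}/(\pi_{k-1}+\pi_k)$ and $(N-1)$ replaced by

$$(N-1)(\pi_{k-1}+\pi_k) = (N-1)\Big(1-\sum_{1}^{k-2}\pi_j\Big)$$

(i.e. $N$ replaced by $N\big(1-\sum_{1}^{k-2}\pi_j\big)+\sum_{1}^{k-2}\pi_j$). For self-consistency $N$ should have been
replaced by

$$N\Big(1-\sum_{1}^{k-2} f_j\Big).$$

(Since $\operatorname{var}\big(\sum_{1}^{k-2} f_j\big) = (\pi_{k-1}+\pi_k)(1-\pi_{k-1}-\pi_k)/N$,
the difference between the two new values of '$N$' will 'usually' be of order $\sqrt{N}$.)

Similarly, in general

$$p(f_{q_1}, \ldots, f_{q_s} \mid f_{p_1}, \ldots, f_{p_t}) = \frac{\Gamma\big((N-1)(1-\sum_{1}^{t}\pi_{p_j})\big)}{\Gamma\big((N-1)(1-\sum_{1}^{s}\pi_{q_i}-\sum_{1}^{t}\pi_{p_j})\big)\prod_{i=1}^{s}\Gamma((N-1)\pi_{q_i})} \times \frac{\big(1-\sum_{1}^{s} f_{q_i}-\sum_{1}^{t} f_{p_j}\big)^{(N-1)(1-\sum\pi_{q_i}-\sum\pi_{p_j})-1}\prod_{i=1}^{s} f_{q_i}^{(N-1)\pi_{q_i}-1}}{\big(1-\sum_{1}^{t} f_{p_j}\big)^{(N-1)(1-\sum\pi_{p_j})-1}}.$$

Making the transformations

$$f'_{q_i} = f_{q_i}\Big/\Big(1-\sum_{1}^{t} f_{p_j}\Big) \quad (i = 1, 2, \ldots, s)$$

we find

$$p(f'_{q_1}, \ldots, f'_{q_s} \mid f_{p_1}, \ldots, f_{p_t}) = \frac{\Gamma\big((N-1)(1-\sum_{1}^{t}\pi_{p_j})\big)}{\Gamma\big((N-1)(1-\sum_{1}^{s}\pi_{q_i}-\sum_{1}^{t}\pi_{p_j})\big)\prod_{i=1}^{s}\Gamma((N-1)\pi_{q_i})} \times \Big(1-\sum_{1}^{s} f'_{q_i}\Big)^{(N-1)(1-\sum\pi_{q_i}-\sum\pi_{p_j})-1}\prod_{i=1}^{s} f'^{\,(N-1)\pi_{q_i}-1}_{q_i} \quad \Big(0 \le f'_{q_i};\ \sum_{1}^{s} f'_{q_i} \le 1\Big). \qquad (8b)$$

Here again, the relative probabilities

$$\pi_{q_i}\Big/\Big(1-\sum_{1}^{t}\pi_{p_j}\Big)$$

are correct, but (8b) implies a nominal 'sample size' equal to

$$N\Big(1-\sum_{1}^{t}\pi_{p_j}\Big)+\sum_{1}^{t}\pi_{p_j}$$

instead of the value $N\big(1-\sum_{1}^{t} f_{p_j}\big)$
required for self-consistency.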
The most economical binomial sequential probability
ratio test


By M. K. VAGHOLKAR AND G. B. WETHERILL
Bombay Textile Research Association and Birkbeck College, London


1. INTRODUCTION 


Batches of items are presented for acceptance inspection, in which the items must be
classified as either effective or defective. Let $p$ denote the fraction defective contained in
a batch. We assume that a critical fraction defective $p_0$ exists and is known, such that a
batch with a fraction defective $p_0$ may without loss be accepted or rejected. We call batches
with fraction defective $p < p_0$ good batches and these should be accepted. Those with
fraction defective $p > p_0$ are termed bad batches and should be rejected. Let $P(p)$ denote
the prior distribution (also called the process curve) for the fraction defective contained in
batches submitted for inspection, and suppose this is of the following type

$$\Pr\{p = p_1\} = a_1, \quad \Pr\{p = p_2\} = a_2, \quad \text{where } a_1 + a_2 = 1 \text{ and } p_1 < p_0 < p_2. \qquad (1)$$

Let $W_{21}$ denote the decision cost incurred if a batch with fraction defective $p_1$ is rejected
and let $W_{12}$ denote the decision cost incurred if a batch with fraction defective $p_2$ is accepted.
Let $c$ denote the cost of inspecting a single item. The problem is to determine the most
economical sampling inspection plan.


2. THE SEQUENTIAL PROBABILITY RATIO TEST 


By assuming a special form of prior distribution, we have reduced our problem of
acceptance sampling to that of testing two simple statistical hypotheses $H_i$ $(i = 1, 2)$, where $H_i$
means that a batch comes from a population with a fraction defective $p_i$ $(i = 1, 2)$. We wish
to accept a batch if $H_1$ holds good, and if $H_2$ holds, we wish to reject the batch.

In a sequential sampling inspection plan one item is inspected at a time, and the inspec- 
tion is stopped as soon as sufficient evidence is obtained in favour of either of the hypotheses. 
Therefore as long as the cost of inspection depends merely on the total number of items 
inspected and no extra overhead cost is involved on account of sequential sampling, a 
sequential sampling inspection plan will be the most economical one. Moreover, Wald & 
Wolfowitz (1948) have shown that, when testing two simple hypotheses, of all tests with the 
same power, the sequential probability ratio test requires on average the fewest observa- 
tions. Here this means that of all the sampling inspection plans with the same or lower 
decision costs, a plan based on the sequential probability ratio test will have the minimum 
cost of inspection. The usual sequential probability ratio test will therefore be the optimum 
test procedure, and this holds as long as the cost of inspection is linearly related to the 
number of items inspected. Below we give the method of arriving at the optimum test 
procedure, based on the basic theory given by Barnard (1954). 
3. EQUATIONS FOR THE OPTIMUM TEST PROCEDURE


The optimum test procedure will be of the following type:
(a) continue inspection as long as

$$\Lambda_2 < \Lambda = \frac{a_1}{a_2}\, l(x, y) < \Lambda_1; \qquad (2)$$

(b) stop inspection and accept the batch as soon as $\Lambda \ge \Lambda_1$;

(c) stop inspection and reject the batch as soon as $\Lambda \le \Lambda_2$;
where $x$ and $y$ denote respectively the number of effectives and defectives obtained at any
stage, and $l(x, y)$ is the likelihood ratio and is equal to $(p_1/p_2)^y (q_1/q_2)^x$, where $q_i = 1 - p_i$. It is more convenient
to write inequality (2) as

$$L < l(x, y) < U, \qquad (3)$$

where $U = a_2\Lambda_1/a_1$ and $L = a_2\Lambda_2/a_1$, and similar rules for acceptance and rejection apply.
Let $A(U, L; p)$ denote the probability that a batch with fraction defective $p$ will be
accepted;
$R(U, L; p)$ denote the probability that a batch with fraction defective $p$ will be rejected;
and $S(U, L; p)$ denote the average number of items required to be inspected.

In order to obtain equations for the optimum boundaries we proceed in the manner 
indicated by Barnard (1954). The decision boundary is the locus of points such that the 
expected cost of taking an immediate decision is equal to the expected cost of taking at 
least one more observation and continuing the test. For example, when A = A,, we write 
down the expected cost of accepting, as a function of the prior probabilities and decision 
costs: if we take another observation, then an effective item leads to accepting the batch,
incurring a decision cost if $H_2$ is true, and a defective item will take us to a point where we
either continue the test or reject immediately.

We determine A, to be that value for which these two costs—of accepting immediately, 
and of taking one more observation and continuing the test if necessary—are equal. 

It should be noted that in general the difference between the expected cost of an im- 
mediate decision and the expected cost of continuing the test should decrease as the sample 
size increases. 

For any $\Lambda$, the posterior probabilities associated with the hypotheses $H_1$ and $H_2$ are
$\Lambda/(1+\Lambda)$ and $1/(1+\Lambda)$, respectively. We are now in a position to derive the equations for
the optimum values $\Lambda_1$ and $\Lambda_2$.

Consider $\Lambda = \Lambda_1$, so that the probability that $H_1$ is true is $\Lambda_1/(1+\Lambda_1)$ and the probability
that $H_2$ is true is $1/(1+\Lambda_1)$. Now the immediate decision to be taken is to accept $H_1$, and
we shall incur a loss $W_{12}$ if $H_2$ is true. The loss due to an immediate decision is therefore
$W_{12}/(1+\Lambda_1)$. Further if we inspect one more item, incurring a cost $c$, and continue the
procedure thereafter, we shall accept the batch if the next item is an effective since
$\Lambda_1 q_1/q_2 > \Lambda_1$. The cost $W_{12}$ will be incurred if $H_2$ is true, so that the loss is

$$W_{12} \times \{\text{prob. } H_2 \text{ true}\} \times \{\text{prob. next item effective} \mid H_2 \text{ true}\} = W_{12}\, q_2/(1+\Lambda_1).$$

If the next item is a defective, the likelihood ratio will be $\Lambda_1 p_1/p_2$, and we shall continue
with a (new) sequential likelihood ratio test starting at this point. If the number of
effectives and defectives obtained with this new test is $(x', y')$, the test is

$$\Lambda_2 < \Lambda_1\, l(x', y')\, p_1/p_2 < \Lambda_1$$

or

$$\frac{\Lambda_2}{\Lambda_1}\,\frac{p_2}{p_1} < l(x', y') < \frac{p_2}{p_1}.$$


Clearly if $\Lambda_2/\Lambda_1 \ge p_1/p_2$ we shall reject immediately, so that sampling continues only if
$\Lambda_2/\Lambda_1 < p_1/p_2$. The cost due to this continuation is

$$[cS(\alpha, \rho\alpha; p_1) + W_{21} R(\alpha, \rho\alpha; p_1)]\, p_1\Lambda_1/(1+\Lambda_1)$$

if $H_1$ is true and $\rho < p_1/p_2$, where $\alpha = p_2/p_1$ and $\rho = \Lambda_2/\Lambda_1$. If $H_2$ is true and $\rho < p_1/p_2$
the continuation cost is

$$[cS(\alpha, \rho\alpha; p_2) + W_{12} A(\alpha, \rho\alpha; p_2)]\, p_2/(1+\Lambda_1).$$

If $\rho \ge p_1/p_2$, we reject the batch, and the cost is $W_{21}\, p_1\Lambda_1/(1+\Lambda_1)$.
We now equate the expected cost of an immediate decision to the expected cost if at least
one more item is inspected, and we have,

$$W_{12}/(1+\Lambda_1) = c + [cS(\alpha, \rho\alpha; p_1) + W_{21} R(\alpha, \rho\alpha; p_1)]\, p_1\Lambda_1/(1+\Lambda_1) + W_{12} q_2/(1+\Lambda_1) + [cS(\alpha, \rho\alpha; p_2) + W_{12} A(\alpha, \rho\alpha; p_2)]\, p_2/(1+\Lambda_1), \qquad (4)$$

when $\rho < p_1/p_2$, and for $\rho \ge p_1/p_2$

$$W_{12}/(1+\Lambda_1) = c + W_{21}\, p_1\Lambda_1/(1+\Lambda_1) + W_{12}\, q_2/(1+\Lambda_1). \qquad (5)$$
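If we use the fact that $A(U, L; p) + R(U, L; p) = 1$ and simplify, equations (4) and (5)
reduce to

$$\Lambda_1 = \frac{W_{12}\, p_2 R(\alpha, \rho\alpha; p_2) - c - p_2 cS(\alpha, \rho\alpha; p_2)}{W_{21}\, p_1 R(\alpha, \rho\alpha; p_1) + c + p_1 cS(\alpha, \rho\alpha; p_1)} \quad \text{when } \rho < p_1/p_2, \qquad (6)$$

and

$$\Lambda_1 = \frac{W_{12}\, p_2 - c}{W_{21}\, p_1 + c} \quad \text{when } \rho \ge p_1/p_2. \qquad (7)$$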


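A similar argument leads to the following equation for $\Lambda_2$,

$$\Lambda_2 = \frac{W_{12}\, q_2 A(\beta/\rho, \beta; p_2) + c + q_2 cS(\beta/\rho, \beta; p_2)}{W_{21}\, q_1 A(\beta/\rho, \beta; p_1) - c - q_1 cS(\beta/\rho, \beta; p_1)}, \quad \text{where } \beta = q_2/q_1 \text{ and } \rho < q_2/q_1. \qquad (8)$$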





Dividing (8) by (6) we have for $\rho < p_1/p_2$

$$\rho = F(\rho) = \frac{[W_{12}\, q_2 A(\beta/\rho, \beta; p_2) + c + q_2 cS(\beta/\rho, \beta; p_2)]}{[W_{21}\, q_1 A(\beta/\rho, \beta; p_1) - c - q_1 cS(\beta/\rho, \beta; p_1)]} \times \frac{[W_{21}\, p_1 R(\alpha, \rho\alpha; p_1) + c + p_1 cS(\alpha, \rho\alpha; p_1)]}{[W_{12}\, p_2 R(\alpha, \rho\alpha; p_2) - c - p_2 cS(\alpha, \rho\alpha; p_2)]}. \qquad (9)$$

Dividing (8) by (7) we have for $\rho \ge p_1/p_2$

$$\rho = F_1(\rho) = \frac{[W_{12}\, q_2 A(\beta/\rho, \beta; p_2) + c + q_2 cS(\beta/\rho, \beta; p_2)]\,[W_{21}\, p_1 + c]}{[W_{21}\, q_1 A(\beta/\rho, \beta; p_1) - c - q_1 cS(\beta/\rho, \beta; p_1)]\,[W_{12}\, p_2 - c]}. \qquad (10)$$

We propose to solve equation (9) (or (10)) for $\rho$ by iteration. The usual way of determining
the boundaries of a sequential probability ratio test is to use Wald's approximate formulae
which necessarily assume the risk of errors to be very small. This is not always true in the
case of sampling inspection plans designed on a minimum cost basis. We therefore have to
use some sort of exact formulae. The only such formulae that so far exist in the literature
and which are practically manageable are those given by Burman (1946).
4. BARNARD'S SCORE NOTATION AND BURMAN'S SEQUENTIAL
SAMPLING FORMULAE FOR A BINOMIAL POPULATION


According to Barnard (1946) any sequential probability ratio test procedure for a
binomial population as defined by (3) can be reduced (without much loss in accuracy) to
a scoring procedure as follows.

Take logarithms on both sides of (3) and divide throughout by $\log(q_1/q_2)$,

$$\frac{\log L}{\log(q_1/q_2)} < x - y\,\frac{\log(p_2/p_1)}{\log(q_1/q_2)} < \frac{\log U}{\log(q_1/q_2)}.$$

Now put

$$b = \frac{\log(p_2/p_1)}{\log(q_1/q_2)}, \quad H_1 = -\frac{\log L}{\log(q_1/q_2)}, \quad H_2 = \frac{\log U}{\log(q_1/q_2)}.$$

If we round off $b$, $H_1$ and $H_2$ to the nearest integers, the sequential likelihood ratio test
procedure reduces to the following scoring scheme. Start with a score $H_1$, and add one to
the score for each effective item found, subtracting $b$ for each defective item found. Reject
the batch if the score falls to zero or less, and accept the batch if the score reaches

$$2H = H_1 + H_2.$$


Formulae for the average sample size and acceptance probability of such a scheme have 
been given by Burman (1946), and they are exact if b, H, and H, are integers. The error 
involved in rounding off is small if b is greater than ten, and this is often the case in practice. 
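
Burman's closed formulae are not reproduced here, but for integer $b$, $H_1$ and $2H$ the acceptance probability and average sample number of the scoring scheme can also be obtained by solving the first-step linear equations directly. The following is a minimal sketch of that alternative, assuming NumPy is available; it is not Burman's method, and the function name is hypothetical.

```python
import numpy as np


def score_scheme(h1, two_h, b, p):
    """Acceptance probability and average sample number of the scoring scheme.

    Start at score h1, add 1 for an effective (probability 1 - p), subtract b
    for a defective (probability p); reject at a score <= 0, accept at >= two_h.
    """
    q = 1.0 - p
    n = two_h - 1                       # interior scores 1, ..., 2H - 1
    system = np.eye(n)
    accept_rhs = np.zeros(n)
    for s in range(1, n + 1):
        i = s - 1
        if s + 1 >= two_h:
            accept_rhs[i] += q          # an effective reaches 2H: accept
        else:
            system[i, s] -= q           # move to score s + 1
        if s - b >= 1:
            system[i, s - b - 1] -= p   # a defective leaves the score positive
        # otherwise the defective causes rejection and contributes nothing
    accept = np.linalg.solve(system, accept_rhs)
    asn = np.linalg.solve(system, np.ones(n))
    return float(accept[h1 - 1]), float(asn[h1 - 1])


if __name__ == "__main__":
    # Illustrative integers only; they are not taken from the paper's tables.
    print(score_scheme(h1=10, two_h=30, b=5, p=0.05))
```

When $b \ge 2H$ a single defective always causes rejection, so starting from score $H_1$ the acceptance probability reduces to $q^{2H-H_1}$, which gives a convenient check on the solver.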

If we replace $A(U, L; p)$, etc., as introduced in §3 by $A(H_2, H_1; p)$, etc., in score notation,
equations (9) and (10) become

$$\rho = F(\rho) = \frac{[W_{12}\, q_2 A(2H-1, 1; p_2) + c + q_2 cS(2H-1, 1; p_2)]}{[W_{21}\, q_1 A(2H-1, 1; p_1) - c - q_1 cS(2H-1, 1; p_1)]} \times \frac{[W_{21}\, p_1 R(b, 2H-b; p_1) + c + p_1 cS(b, 2H-b; p_1)]}{[W_{12}\, p_2 R(b, 2H-b; p_2) - c - p_2 cS(b, 2H-b; p_2)]} \qquad (11)$$

if $\rho < p_1/p_2$, and

$$\rho = F_1(\rho) = \frac{[W_{12}\, q_2 A(2H-1, 1; p_2) + c + q_2 cS(2H-1, 1; p_2)]\,[W_{21}\, p_1 + c]}{[W_{21}\, q_1 A(2H-1, 1; p_1) - c - q_1 cS(2H-1, 1; p_1)]\,[W_{12}\, p_2 - c]} \qquad (12)$$

if $\rho \ge p_1/p_2$, where

$$b = \frac{\log(p_2/p_1)}{\log(q_1/q_2)} \quad \text{and} \quad 2H = -\frac{\log\rho}{\log(q_1/q_2)},$$

rounded off to the nearest integer.


5. PROCEDURE FOR CALCULATION 


In practice we shall first have to fix values for $a_1$, $a_2$, $p_1$, $p_2$, $W_{12}$, $W_{21}$ and $c$. The quantity $b$
can then be calculated, or read directly from the Target Handicap Charts (1945). To solve
equations (11) and (12) we start with some guessed value for $\rho$, and proceed with iteration
until we get the same value of $\rho$. Actually on the right-hand side of equations (11) and (12),
$\rho$ enters through $2H$, which is an integer, so that when we get a $\rho$ which gives rise to the same
value of $2H$ as was used in the previous iteration, we stop and take that as our final solution.
Once $\rho$ is obtained we calculate $\Lambda_1$ and $\Lambda_2$ from equations (6) and (8) or (7) and (8) as the
case may be.
The value of $\rho$ is always less than one, and a lower limit for $\rho$ is derived by Vagholkar
(1955) from which we have

$$c^2/\{[W_{12}(q_1-q_2)-c]\,[W_{21}(q_1-q_2)-c]\} < \rho < 1. \qquad (13)$$

From a number of examples that have been worked out so far it appears that the lower
limit, or any number a little higher than the lower limit, can serve as a good first guess for $\rho$,
in order to start the iteration.


6. ILLUSTRATIVE EXAMPLES

Example 1. $a_1 = \frac12$, $p_1 = 0.01$, $W_{21} = 400$, $c = 1$;
           $a_2 = \frac12$, $p_2 = 0.10$, $W_{12} = 500$.

We get $b = 24$, and by (13) we have $0.00065 < \rho < 1$. Starting with a guess of 0.001 for $\rho$,
the successive values of $\rho$ obtained are,

  $\rho$:    0.001  →  0.001607  →  0.001600
  $2H$:       72          68           68

This gives $\Lambda_1 = 30.5$, $\Lambda_2 = 0.0489$, and the optimum test procedure is given by
$H_1 = 34$, $2H = H_1 + H_2 = 68$, $b = 24$.


Example 2. $a_1 = 0.5981$, $p_1 = 0.008663$, $W_{21} = 143.6$, $c = 1$;
           $a_2 = 0.4019$, $p_2 = 0.041856$, $W_{12} = 266.0$.

Here $b = 46$, and the limits for $\rho$ are

  $0.0339 < \rho < 1$.

Starting with a first guess of $\rho = 0.1$ we finally get

  $\rho = 0.2199$, $\Lambda_1 = 4.5163$ and $\Lambda_2 = 0.9930$.

In terms of score notation the optimum test procedure is

  $H_1 = 12$, $2H = H_1 + H_2 = 44$ and $b = 46$.
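
Several of the quoted figures can be checked with a few lines of arithmetic. The sketch below is illustrative only: it takes $b = \log(p_2/p_1)/\log(q_1/q_2)$ and $2H = -\log\rho/\log(q_1/q_2)$ rounded to the nearest integer, the lower limit (13), and equation (7) for $\Lambda_1$, and it assumes that the decision cost printed beside $p_1$ is $W_{21}$ and the one beside $p_2$ is $W_{12}$ (the assignment that reproduces $\Lambda_1$ in Example 2). The function names are hypothetical.

```python
import math


def score_constants(p1, p2, rho):
    """b and 2H, each rounded to the nearest integer."""
    log_q_ratio = math.log((1 - p1) / (1 - p2))      # log(q1/q2)
    b = round(math.log(p2 / p1) / log_q_ratio)
    two_h = round(-math.log(rho) / log_q_ratio)
    return b, two_h


def rho_lower_limit(p1, p2, w12, w21, c):
    """Lower limit for rho in inequality (13)."""
    dq = p2 - p1                                      # equals q1 - q2
    return c ** 2 / ((w12 * dq - c) * (w21 * dq - c))


def lambda1_when_rejecting_at_once(p1, p2, w12, w21, c):
    """Equation (7): Lambda_1 when rho >= p1/p2."""
    return (w12 * p2 - c) / (w21 * p1 + c)


if __name__ == "__main__":
    # Example 1
    print(score_constants(0.01, 0.10, 0.0016), rho_lower_limit(0.01, 0.10, 500, 400, 1))
    # Example 2, where rho = 0.2199 exceeds p1/p2 = 0.2070, so (7) applies
    print(score_constants(0.008663, 0.041856, 0.2199),
          rho_lower_limit(0.008663, 0.041856, 266.0, 143.6, 1),
          lambda1_when_rejecting_at_once(0.008663, 0.041856, 266.0, 143.6, 1))
```

In Example 2, $b$ exceeds $2H$, so a single defective always leads to rejection and the scheme reduces to requiring an unbroken run of effectives.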


7. THE COST FUNCTION 


It has been assumed above that the cost of inspection is proportional to the number of 
items inspected, and that there are no other costs involved by the use of sequential sam- 
pling. Now in practice two particular types of departure from this assumption commonly 
arise, which can be taken account of as given below. 

(i) In spite of the optimum properties of sequential schemes, they are not often used 
owing to the difficulty of inducing shop-floor staff to work them. The difficulty seems to 
arise in that a decision is required after every item inspected as to whether to continue 
sampling or sentence the batch. This can be taken account of in the cost function by adding 
an extra term which will depend on the number of times the decision to continue sampling 
or sentence the batch has to be made, which we denote D, and also on the cost asso- 
ciated with stopping sampling in order to make this decision, which we denote d. As a first 
approximation we shall assume the extra term in the cost function to be the product of
these two factors, Dd, though in practice a quadratic in D may be more realistic. 

(ii) It often arises that items have to be processed before inspection. For example, in 
the climatic testing of resistors (see B.S. 2011, 1954) the items sampled are taken through 
temperature and humidity cycles, and the cost of processing a group of items is the same as 
the cost of processing one. We can take account of this by adding to the cost function the 
product of the number of times the processing is carried out, denoted $T$, with the cost
associated with processing, denoted $t$, so that the product is denoted $Tt$.

By adding these two terms our cost function becomes 


cost = nc+ Dd + Tt. (14) 


The general problem for such cost functions will not be formulated here, but it is clear 
that the optimum test will be a likelihood ratio sequential test, with items sampled in groups 
(not necessarily of a constant size). For example, a certain valve factory sentences batches 
by inspecting an initial sample of 20 and succeeding samples (if any) of 5 valves. By limiting 
consideration to schemes with a constant group size, the problem is brought within the 
scope of the general theory, and we shall therefore do this. In defence it may be pointed 
out that practical considerations such as the capacity of the processing apparatus, or the 


need for simplicity may sometimes demand a constant group size. If the group size is $k$,
equation (14) becomes

$$\text{cost} = nc + \frac{n}{k}d + \frac{n}{k}t = nc', \quad \text{where} \quad c' = c + \frac{d}{k} + \frac{t}{k}, \qquad (15)$$

which is a linear function of the number of items inspected, with the artificial cost per
item $c'$.


8. CALCULATION OF GROUP SAMPLING BOUNDARIES 


Two particular results apply to group sequential sampling, which immediately follow 
from the results of Wald & Wolfowitz (1948). They are stated here as lemmas. 


Lemma 1. The risk of a group sequential sampling scheme is greater than or equal to the
risk for a unit-step sequential scheme, for given prior probability and loss functions.


Lemma 2. The optimum boundaries for group sequential sampling are within the 


optimum boundaries for unit-step sequential sampling with the same probability and loss 
functions. 


The practical importance of these lemmas is that for group sequential sampling we can 
derive the optimum boundaries for unit-step sequential sampling using the artificial cost 
per item given in equation (15). These boundaries will contain the true optimum boundaries. 
A few trial calculations will result in approximately optimum boundaries. These trial 
calculations involve calculations of the group sampling continuation risk, which will 
have to be done numerically, truncating after two or three groups. 

If the group size is large, and we can replace the two-point binomial distribution by an 


equivalent two-point normal distribution, a more exact solution is possible (Wetherill, 
1959). 
9. MULTIPOINT PRIOR DISTRIBUTIONS


So far this paper rests on approximating to a prior distribution by a two-point binomial.
Vagholkar (1959) has suggested a method of doing this, and has given sampling schemes
close to those given by the true prior distribution for examples so far calculated.
Unfortunately, the case of more than two ordinates is too difficult to deal with mathematically
at present, but we note the following result.

Consider, for example, the three-point binomial distribution given by

$$\Pr(p = p_i) = a_i \quad (i = 1, 2, 3), \qquad (16)$$

where $\sum a_i = 1$ and $0 < p_1 < p_0 < p_2 < p_3 < 1$,
and $p_0$ is the critical fraction defective defined in § 1. The optimum decision boundaries for
this prior distribution will in general meet (this holds for any number of ordinates greater
than two). The equations are (Wetherill, 1957, p. 87)


$$a_1 p_1^y q_1^x W_{21} - a_2 p_2^y q_2^x W_{12} - a_3 p_3^y q_3^x W_{13} = 0, \qquad (17)$$

$$a_1 p_1^y q_1^x (1 + p_1 W_{21}) + a_2 p_2^y q_2^x (1 - p_2 W_{12}) + a_3 p_3^y q_3^x (1 - p_3 W_{13}) = 0. \qquad (18)$$


Equation (17) is the neutral line, or locus of points such that the risks of taking the two
decisions are equal, and equation (18) is obtained by the condition that the two decision
boundaries must have a common point. If $p_3 = p_2$, we have the two-ordinate case, and
equation (18) reduces to

$$W_{12} W_{21}(p_2 - p_1) - W_{12} - W_{21} = 0, \qquad (19)$$

which is the condition that it is not worth sampling (i.e. the two boundaries are identical)
for the two-ordinate distribution based on $(p_1, p_2)$.

It is found that equations (17) and (18) give a real solution if it is worth sampling for the
two-ordinate $(p_1, p_3)$ scheme, but that it is not worth sampling for the two-ordinate $(p_1, p_2)$
scheme.


The above work is based on theses submitted for the degree of Ph.D. in the University 
of London, and our thanks are due to Prof. G. A. Barnard of the Imperial College of Science, 
under whose guidance and supervision the work was done. 
An approximate test for serial correlation in
polynomial regression 


By J. R. McGREGOR 
University of Birmingham 


1. INTRODUCTION AND SUMMARY 


The usual least squares methods of analysis of regression problems fail if the error terms 
in the regression model are serially correlated (see, for example, Durbin & Watson (1950) 
and Watson (1951)). Since the error terms themselves are not known, tests of serial correla- 
tion must be based on residuals from the fitted regression. 

It has been customary to use test criteria of the form $z'Az/z'z$, where $z' = (z_1, z_2, \ldots, z_n)$
is the vector of residuals and A is a square matrix the choice of which depends on the type of 
correlation model to be tested. The only exact tests which have been developed are for special 
cases where the regression vectors coincide with certain latent vectors of the matrix A. 
Thus R. L. & T. W. Anderson (1950) developed a method of finding the exact significance
points of a circularly defined serial correlation coefficient based on the residuals from 
a fitted Fourier series. 


Durbin & Watson (1950, 1951) developed bounds to the significance points of the test
criterion

\[ d = \frac{\sum_{j=1}^{n-1} (z_{j+1} - z_j)^2}{\sum_{j=1}^{n} z_j^2}. \tag{1.1} \]

The sample serial correlation may be defined in terms of d by the relation

\[ r = 1 - \tfrac{1}{2} d. \tag{1.2} \]
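As a small illustration of the two definitions above, the following sketch computes d as the ratio of the sum of squared successive differences of the residuals to their sum of squares, and then r = 1 - d/2. The function name and the residual values are hypothetical and chosen only for the example.

```python
def durbin_watson(z):
    """Return (d, r) for a sequence of residuals z.

    d is the ratio of the sum of squared successive differences to the
    sum of squares; r is the serial correlation defined as 1 - d/2.
    """
    num = sum((z[j + 1] - z[j]) ** 2 for j in range(len(z) - 1))
    den = sum(v * v for v in z)
    d = num / den
    return d, 1.0 - 0.5 * d


# Hypothetical residuals, used only to exercise the function.
residuals = [0.5, -0.3, 0.1, -0.4, 0.2, -0.1]
d, r = durbin_watson(residuals)
```

Since d lies between 0 and 4, r lies between -1 and 1; values of d well below 2 point towards positive serial correlation.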


The bounds to the significance points of d are ‘best’ in the sense that they are attained (when
the regression vectors coincide with certain of the latent vectors of A) and, when attained,
the test criterion is uniformly most powerful against suitable alternative hypotheses.
Hannan (1957) has remarked that, in the case of regression on the orthogonal polynomials,
the upper bounds of Durbin & Watson differ from the true significance points by quantities
of the order of $n^{-2}$, where n is the sample size.

In the present paper, an approach due to Daniels (1954, 1956) is used to obtain the
approximate distribution of the sample serial correlation r (1.2). Hannan’s remark on the
proximity of the significance points of d (1.1) and $d_u$, the upper bound of Durbin & Watson’s
statistic, is verified, and approximate significance points of d are compared with the
corresponding significance points of $d_u$ tabulated by Durbin & Watson (1951).


2. PRELIMINARY RESULTS 


Before attacking the main problem of this paper, it will be convenient to review some 
preliminary results on the expansion of certain determinants. 
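Consider the n x n determinant

\[ B_n = \begin{vmatrix}
a & -1 & 0 & \cdots & 0 & 0 \\
-1 & b & -1 & \cdots & 0 & 0 \\
0 & -1 & b & \cdots & 0 & 0 \\
\vdots & & & \ddots & & \vdots \\
0 & 0 & 0 & \cdots & b & -1 \\
0 & 0 & 0 & \cdots & -1 & a
\end{vmatrix}. \tag{2.1} \]

Let $C_j$ be the determinant consisting of the first j rows and j columns of $B_n$ (j < n). Then,
expanding by the last row,

\[ C_j = b\,C_{j-1} - C_{j-2}, \qquad C_1 = a, \quad C_2 = ab - 1. \]

The solution of this difference equation is

\[ C_j = \frac{(az - 1)\,z^{j} - (a - z)\,z^{1-j}}{z^2 - 1}, \]

where $z + 1/z = b$. Finally, since $B_n = a\,C_{n-1} - C_{n-2}$, we obtain

\[ B_n = \frac{(az - 1)^2 z^{n-2} - (a - z)^2 z^{2-n}}{z^2 - 1}. \tag{2.2} \]

Now consider the n x n matrix

\[ A = \begin{pmatrix}
1 - 2T_0 - T & -T & 0 & \cdots & 0 & 0 \\
-T & 1 - 2T_0 & -T & \cdots & 0 & 0 \\
0 & -T & 1 - 2T_0 & \cdots & 0 & 0 \\
\vdots & & & \ddots & & \vdots \\
0 & 0 & 0 & \cdots & 1 - 2T_0 & -T \\
0 & 0 & 0 & \cdots & -T & 1 - 2T_0 - T
\end{pmatrix}. \tag{2.3} \]

Factoring T from each row of |A| gives the form (2.1) with $a = (1 - 2T_0 - T)/T$ and $b = (1 - 2T_0)/T$, whence, by (2.2),

\[ |A| = \frac{T^n}{z^n (1 - z^2)} \left[ z^2 \left( \frac{1 - 2T_0 - T}{T} - z \right)^2 - z^{2n-2} \left( \frac{(1 - 2T_0 - T)\,z}{T} - 1 \right)^2 \right], \tag{2.4} \]

where

\[ z + 1/z = (1 - 2T_0)/T. \tag{2.5} \]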


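The work of von Neumann (1941) leads to an alternative expansion of |A| in terms of
the latent roots of the n x n matrix

\[ Q = \begin{pmatrix}
1 & -1 & 0 & \cdots & 0 & 0 \\
-1 & 2 & -1 & \cdots & 0 & 0 \\
0 & -1 & 2 & \cdots & 0 & 0 \\
\vdots & & & \ddots & & \vdots \\
0 & 0 & 0 & \cdots & 2 & -1 \\
0 & 0 & 0 & \cdots & -1 & 1
\end{pmatrix}. \]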
von Neumann found the latent roots of Q to be

\[ \lambda_j = 4 \sin^2 \frac{(j-1)\pi}{2n} \quad (j = 1, 2, \ldots, n), \tag{2.6} \]

so that the first root $\lambda_1 = 0$. We may write A in the form

\[ A = (1 - 2T_0 - 2T)\,I + T Q, \tag{2.7} \]

where I is the n x n identity matrix. Let H be an orthogonal matrix which diagonalizes Q. Then

\[ H'AH = (1 - 2T_0 - 2T)\,I + T\,H'QH, \]

and thus

\[ |A| = (1 - 2T_0 - 2T) \prod_{j=2}^{n} (1 - 2T_0 - 2T + T\lambda_j). \tag{2.8} \]
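The two expansions of |A| can be compared numerically. The sketch below is only a check under stated assumptions: it takes A to be the tridiagonal matrix with diagonal (1 - 2T0 - T, 1 - 2T0, ..., 1 - 2T0, 1 - 2T0 - T) and off-diagonal elements -T, evaluates |A| directly by the usual three-term recurrence, and compares the result with the closed form in z and with the product over the latent roots 4 sin^2{(j-1)pi/(2n)}. The function names and the trial values of n, T0 and T are hypothetical; real z requires (1 - 2T0)/T > 2.

```python
import math


def det_tridiagonal(diag, off):
    """Determinant of a symmetric tridiagonal matrix with constant off-diagonal."""
    prev, cur = 1.0, diag[0]
    for k in range(1, len(diag)):
        prev, cur = cur, diag[k] * cur - off * off * prev
    return cur


def det_a_direct(n, t0, t):
    """|A| from the matrix itself (n >= 2)."""
    corner = 1 - 2 * t0 - t
    diag = [corner] + [1 - 2 * t0] * (n - 2) + [corner]
    return det_tridiagonal(diag, -t)


def det_a_closed(n, t0, t):
    """|A| = T^n * B_n with a = (1-2T0-T)/T and z + 1/z = (1-2T0)/T."""
    a = (1 - 2 * t0 - t) / t
    b = (1 - 2 * t0) / t
    z = (b - math.sqrt(b * b - 4)) / 2  # root with |z| < 1
    b_n = ((a * z - 1) ** 2 * z ** (n - 2)
           - (a - z) ** 2 * z ** (2 - n)) / (z * z - 1)
    return t ** n * b_n


def det_a_roots(n, t0, t):
    """|A| as the product over the latent roots of Q."""
    base = 1 - 2 * t0 - 2 * t
    out = base
    for j in range(2, n + 1):
        lam = 4 * math.sin((j - 1) * math.pi / (2 * n)) ** 2
        out *= base + t * lam
    return out


# Hypothetical trial values.
values = [f(6, 0.1, 0.2) for f in (det_a_direct, det_a_closed, det_a_roots)]
```

For n = 2 all three expressions reduce to (1 - 2T0)(1 - 2T0 - 2T), which gives a quick hand check.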


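It will be convenient to consider a third expansion of |A|. Let $\Xi$ be the n x n orthogonal
matrix with elements

\[ \xi_{ij} = \left[ \frac{(2i-1)\,(n-i)!}{(n+i-1)!} \right]^{1/2} \frac{1}{(i-1)!}\; \Delta^{i-1}\, (j-1)^{(i-1)} (j-1-n)^{(i-1)}, \tag{2.9} \]

where $\Delta$ is the forward difference operator defined by

\[ \Delta f(x) = f(x+1) - f(x) \quad\text{and}\quad x^{(r)} = x(x-1)(x-2)\cdots(x-r+1), \]

with $x^{(0)} = 1$. From (2.7), we may write

\[ \Xi A \Xi' = (1 - 2T_0 - 2T)\,I + T\,\Xi Q \Xi'. \tag{2.10} \]

It is easily verified that $Q = \Delta'\Delta$, where $\Delta$ is the n x n matrix

\[ \Delta = \begin{pmatrix}
0 & 0 & 0 & \cdots & 0 & 0 \\
-1 & 1 & 0 & \cdots & 0 & 0 \\
0 & -1 & 1 & \cdots & 0 & 0 \\
\vdots & & & \ddots & & \vdots \\
0 & 0 & 0 & \cdots & 1 & 0 \\
0 & 0 & 0 & \cdots & -1 & 1
\end{pmatrix}, \]

so that $\Xi Q \Xi' = (\Delta\Xi')'\,\Delta\Xi'$, where $\Delta\Xi'$ has elements given by

\[ \theta_{ij} = \xi_{ji} - \xi_{j,i-1}, \quad \theta_{1j} = \theta_{i1} = 0 \quad (i = 2, 3, \ldots, n;\ j = 2, 3, \ldots, n). \tag{2.11} \]

Thus $\Xi Q \Xi' = D$, where D is the n x n matrix with elements

\[ d_{ij} = \sum_{s=2}^{n} \theta_{si}\theta_{sj}, \quad d_{1j} = d_{i1} = 0 \quad (i, j = 2, 3, \ldots, n). \tag{2.12} \]

Finally, from (2.10), we obtain

\[ |A| = |(1 - 2T_0 - 2T)\,I + T D|. \]

We note that, since D is obtained from Q by an orthogonal transformation, the latent roots
of D and Q are the same and, by the invariance of the trace under orthogonal transformations,

\[ \sum_{i=2}^{n} d_{ii} = \sum_{i=2}^{n} \sum_{k=2}^{n} \theta_{ki}^2 = \lambda_2 + \lambda_3 + \cdots + \lambda_n. \tag{2.13} \]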
3. THE APPROXIMATE DISTRIBUTION OF THE SERIAL CORRELATION COEFFICIENT

We shall consider the regression model

\[ u_s = \xi_{1s}\beta_1 + \xi_{2s}\beta_2 + \cdots + \xi_{ms}\beta_m + \epsilon_s \quad (s = 1, 2, \ldots, n), \]

where m is O(1) with respect to n and the $\epsilon$’s are independent N(0, 1) variables. The residuals
$z_s$ are

\[ z_s = u_s - \sum_{i=1}^{m} \xi_{is} \sum_{k=1}^{n} \xi_{ik} u_k
       = \epsilon_s - \sum_{i=1}^{m} \xi_{is} \sum_{k=1}^{n} \xi_{ik} \epsilon_k \quad (s = 1, 2, \ldots, n). \]

Transform to new independent variables $y_i$ by $y = \Xi\epsilon$, where $\Xi$ is the n x n orthogonal matrix
with elements $\xi_{ij}$ (2.9) and $y' = (y_1, y_2, \ldots, y_n)$, $\epsilon' = (\epsilon_1, \epsilon_2, \ldots, \epsilon_n)$. Then

\[ z_s = \sum_{i=m+1}^{n} \xi_{is} y_i, \tag{3.1} \]

with

\[ \sum_{s=1}^{n} z_s^2 = \sum_{i=m+1}^{n} y_i^2 \tag{3.2} \]


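by orthogonality. From (3.1), we have

\[ z_{s+1} - z_s = \sum_{i=m+1}^{n} y_i (\xi_{i,s+1} - \xi_{i,s}) = \sum_{i=m+1}^{n} y_i\, \theta_{s+1,i}, \]

where $\theta_{ij}$ is defined by (2.11). Thus

\[ \sum_{s=1}^{n-1} (z_{s+1} - z_s)^2
   = \sum_{i=m+1}^{n} \sum_{j=m+1}^{n} y_i y_j \sum_{s=1}^{n-1} \theta_{s+1,i}\theta_{s+1,j}
   = \sum_{i=m+1}^{n} \sum_{j=m+1}^{n} y_i y_j d_{ij}, \tag{3.3} \]

where $d_{ij}$ is given by (2.12).
Let $D_{n-m}$ be the (n-m) x (n-m) matrix with elements

\[ d_{ij} = \sum_{s=1}^{n-1} \theta_{s+1,i}\theta_{s+1,j} = \sum_{s=2}^{n} \theta_{si}\theta_{sj} \quad (i, j = m+1, \ldots, n). \]

Then $D_{n-m}$ is a principal submatrix of D (2.12). There is an orthogonal transformation
$H_{n-m}$ which will diagonalize $D_{n-m}$, leaving the sums of squares invariant. Then

\[ H'_{n-m} D_{n-m} H_{n-m} = \mathrm{diag}(l_1, l_2, \ldots, l_{n-m}), \]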
where $l_1, l_2, \ldots, l_{n-m}$ are the latent roots of $D_{n-m}$. Define new variables $\zeta_1, \zeta_2, \ldots, \zeta_{n-m}$ by

\[ \zeta = H'_{n-m}\, y^{*}, \]

where $\zeta' = (\zeta_1, \zeta_2, \ldots, \zeta_{n-m})$ and $y^{*\prime} = (y_{m+1}, y_{m+2}, \ldots, y_n)$.

Then, from (3.2) and (3.3),

\[ \sum_{s=1}^{n} z_s^2 = \sum_{t=1}^{n-m} \zeta_t^2 \quad\text{and}\quad \sum_{s=1}^{n-1} (z_{s+1} - z_s)^2 = \sum_{t=1}^{n-m} l_t \zeta_t^2. \]

Define the sample serial correlation to be $r = c/c_0$, where

\[ c = \sum_{s=1}^{n} z_s^2 - \tfrac{1}{2} \sum_{s=1}^{n-1} (z_{s+1} - z_s)^2 = \sum_{t=1}^{n-m} \zeta_t^2 (1 - \tfrac{1}{2} l_t),
   \qquad c_0 = \sum_{s=1}^{n} z_s^2 = \sum_{t=1}^{n-m} \zeta_t^2. \]

The joint moment-generating function of $c_0$ and c is

\[ M(T_0, T) = E(e^{T_0 c_0 + T c}) = \prod_{t=1}^{n-m} (1 - 2T_0 - 2T + T l_t)^{-1/2}. \tag{3.4} \]


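It is shown in the Appendix that each $(\lambda_{m+s} - l_s)$ is positive and at most $O(n^{-2})$ (s = 1, 2, ..., n-m),
and that $\sum_{s=1}^{n-m} (\lambda_{m+s} - l_s)$ is $O(n^{-2})$. Thus

\[ \prod_{t=1}^{n-m} (1 - 2T_0 - 2T + T l_t) \sim \prod_{t=m+1}^{n} (1 - 2T_0 - 2T + T\lambda_t)\,\{1 + O(n^{-2})\}. \tag{3.5} \]

Using equation (2.8), we obtain

\[ \prod_{t=m+1}^{n} (1 - 2T_0 - 2T + T\lambda_t) = \frac{|A|}{(1 - 2T_0 - 2T) \prod_{t=2}^{m} (1 - 2T_0 - 2T + T\lambda_t)}. \]

Since m is O(1) with respect to n,

\[ \lambda_t = 4 \sin^2 \frac{(t-1)\pi}{2n} \ \text{is of order } n^{-2} \quad (t = 2, 3, \ldots, m). \]

Hence,

\[ (1 - 2T_0 - 2T) \prod_{t=2}^{m} (1 - 2T_0 - 2T + T\lambda_t) \sim (1 - 2T_0 - 2T)^m \{1 + O(n^{-2})\}, \]

and, from (3.5),

\[ \prod_{t=1}^{n-m} (1 - 2T_0 - 2T + T l_t) \sim \frac{|A|}{(1 - 2T_0 - 2T)^m} \{1 + O(n^{-2})\}. \]

Finally, using (2.4) and (2.5), we obtain

\[ \prod_{t=1}^{n-m} (1 - 2T_0 - 2T + T l_t) \sim \frac{T^{n-m}}{z^{n-m} (1 - z)^{2m} (1 - z^2)}
   \left[ z^2 \left( \frac{1 - 2T_0 - T}{T} - z \right)^2 - z^{2n-2} \left( \frac{(1 - 2T_0 - T)\,z}{T} - 1 \right)^2 \right] \{1 + O(n^{-2})\}. \]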
Since |z| < 1 except in critical cases where |z| = 1, we incur an exponentially small error in
omitting the second term in the brackets. Then, from (3.4), we have

\[ M(u - rT, T) \sim \frac{(1 - 2rz + z^2)^{\frac{1}{2}(n-m)} (1 - z)^{m-1} (1 - z^2)^{1/2}}{(1 - 2u)^{\frac{1}{2}(n-m)}} \{1 + O(n^{-2})\}, \]

and 
h [a —rT,T) (=) | Lt ~ (n—m-—2)(1— 2rz+22)Hn—m)-2 (1 — z)™-1(1 — 2?) {1 + O(n-)}, 


where we have made the substitutions

\[ u = T_0 + rT \quad\text{and}\quad z + 1/z = (1 - 2T_0)/T. \]

To obtain the approximate distribution of r, we follow closely the arguments of Daniels
(1956). The density function h(r) of r is given approximately by

\[ h(r) \sim \frac{(n-m-2)}{2\pi i} \int (1 - 2rz + z^2)^{\frac{1}{2}(n-m)-2}\, \phi(z)\, dz, \]

where

\[ \phi(z) = (1 - z)^{m-1} (1 - z^2)^{3/2}, \]

the path of integration being the same as that described by Daniels (1956, p. 172) for the
case $\rho = 0$. Expanding $\phi(z)$ in Taylor’s series about z = r and integrating, we obtain the
first approximation

\[ h(r) \sim \frac{(n-m-2)\,\Gamma(\tfrac{1}{2}(n-m) - 1)}{2\sqrt{\pi}\;\Gamma(\tfrac{1}{2}(n-m) - \tfrac{1}{2})}\, (1 - r)^{m-1} (1 - r^2)^{\frac{1}{2}(n-m)} \{1 + O(n^{-1})\}. \tag{3.6} \]




In re-normalized form, (3.6) becomes

\[ h(r) \sim \frac{(1 - r)^{m-1} (1 - r^2)^{\frac{1}{2}(n-m)}}{2^{n}\, B(\tfrac{1}{2}(n-m) + 1,\ \tfrac{1}{2}(n+m))} \{1 + O(n^{-2})\}, \tag{3.7} \]

where B(p, q) is the usual B-function and where an argument of Daniels (1956, p. 178) shows
that the relative error is $O(n^{-2})$.

The serial correlation r is related to the statistic d (1.1) of Durbin & Watson (1950, 1951)
by the relation $r = 1 - \tfrac{1}{2}d$. Using this relation, the approximate renormalized density
function of d is readily found to be

\[ f(d) \sim \frac{d^{\frac{1}{2}(n+m)-1} (4 - d)^{\frac{1}{2}(n-m)}}{2^{2n}\, B(\tfrac{1}{2}(n-m) + 1,\ \tfrac{1}{2}(n+m))} \{1 + O(n^{-2})\} \quad (0 < d < 4). \tag{3.8} \]

It is shown in the Appendix that the exact distribution of d, in the case of regression on
the orthogonal polynomials, differs from that of $d_u$, the upper bound of Durbin & Watson’s
statistic, by quantities which are relatively $O(n^{-2})$. It follows that the corresponding
approximate distribution of d (3.8) differs from the distribution of $d_u$ by quantities of the
same order. Table 1 compares 5 % and 1 % significance points of d calculated from (3.8) with
the corresponding points tabulated by Durbin & Watson (1951) for $d_u$, for certain values
of m and n.
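The approximate density of d says, in effect, that d/4 follows a Beta distribution with parameters (n+m)/2 and (n-m)/2 + 1, so lower significance points of d can be found by inverting its distribution function. The sketch below is one hedged way to do this with the standard library only: the integral is evaluated by composite Simpson's rule and the percentage point by bisection. The function names, the number of integration steps and the number of bisection iterations are illustrative assumptions, not part of the paper.

```python
import math


def log_density_d(d, n, m):
    """Log of the approximate density of d on 0 < d < 4."""
    a = 0.5 * (n + m)        # exponent of d is a - 1
    b = 0.5 * (n - m) + 1.0  # exponent of (4 - d) is b - 1
    log_beta = math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)
    return ((a - 1) * math.log(d) + (b - 1) * math.log(4 - d)
            - n * math.log(4) - log_beta)


def cdf_d(x, n, m, steps=2000):
    """P(d <= x) by composite Simpson's rule; steps must be even."""
    if x <= 0:
        return 0.0
    if x >= 4:
        return 1.0

    def f(t):
        if t <= 0.0 or t >= 4.0:
            return 0.0
        return math.exp(log_density_d(t, n, m))

    h = x / steps
    total = f(0.0) + f(x)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * f(k * h)
    return total * h / 3


def lower_point(alpha, n, m):
    """Value x with P(d <= x) = alpha, found by bisection on (0, 4)."""
    lo, hi = 0.0, 4.0
    for _ in range(50):
        mid = 0.5 * (lo + hi)
        if cdf_d(mid, n, m) < alpha:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# k' = m - 1 = 1 regressor besides the mean, n = 16 observations.
point_16 = lower_point(0.05, 16, 2)
```

With n = 16 and m = 2 the result should lie close to the 5 % entry 1.33 of Table 1; other entries can be checked in the same way by changing n, m and the level.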
Table 1. 5% and 1% significance points of d (approximate) and d_u

                 k' = m-1 = 1       k' = 3            k' = 5
  n    Level      d      d_u       d      d_u       d      d_u
 16     5%      1.33    1.37     1.56    1.73     1.80    2.15
        1%      1.04    1.09     1.25    1.44     1.48    1.90
 20     5%      1.39    1.41     1.58    1.68     1.77    1.99
        1%      1.12    1.15     1.29    1.41     1.48    1.74
 24     5%      1.43    1.45     1.59    1.66     1.75    1.90
        1%      1.18    1.20     1.33    1.41     1.48    1.66
 30     5%      1.48    1.49     1.61    1.65     1.73    1.83
        1%      1.25    1.26     1.37    1.42     1.50    1.61
 40     5%      1.54    1.54     1.64    1.66     1.73    1.79
        1%      1.34    1.34     1.43    1.46     1.52    1.58

I am indebted to Prof. H. E. Daniels for his advice during the preparation of this paper.

APPENDIX
Consider the diagonal elements of D defined by (2.12). These are

\[ d_{11} = 0, \qquad d_{ii} = \sum_{j=2}^{n} \theta_{ji}^2 \quad (i = 2, 3, \ldots, n), \]

where $\theta_{ji} = \xi_{ij} - \xi_{i,j-1} = \Delta\xi_{i,j-1}$, with $\Delta$ the forward difference operator on j and $\xi_{ij}$ defined by (2.9). Then

\[ d_{ii} = \sum_{j=2}^{n} (\Delta\xi_{i,j-1})^2
        = \frac{(2i-1)\,(n-i)!}{(n+i-1)!\,\{(i-1)!\}^2} \sum_{j=2}^{n} \{\Delta^{i} (j-2)^{(i-1)} (j-2-n)^{(i-1)}\}^2. \tag{A 1} \]

Let x = j-2 and r = i-1. Then

\[ \sum_{j=2}^{n} \{\Delta^{i} (j-2)^{(i-1)} (j-2-n)^{(i-1)}\}^2 = \sum_{x=0}^{n-2} \{\Delta^{r+1} x^{(r)} (x-n)^{(r)}\}^2. \tag{A 2} \]

Adopting the notation of Szegö (1939), we write this as
51) n—2 n—2 
ty (rt)? & (Ats(a))?= Dy [ArHtat(a — ny] AfAraar—n)”}, (A3) 
2=0 z=0 
1 
where t,(x) = = Ata (a —n). 
8) 
Summing by parts on the right of (A 3), we obtain 
n—2 
on (71)? > (At,(a))? = [(Ar+ ata — ny) A(Arat(ax — nm e=9-? 
1's 7 n—2 
ng — SS (Art t(ae — ny) A(ATH(ae + 1) (w+ 1—n)). 
x2=0 
he 1 . . . . 
th Performing repeated summations by parts on the right and noting that A®*+*a(a—n)” = 0, k > 0, 
we ultimately obtain 
ies 


—2 —1 
(rt)? (Aty(z))® = (— 1* [Areata — ny) (APH +8) (w+ 8—n) EB, (A4) 
z=0 s=0 


It is easily verified that
Ar-*(a +8) (2+s—n)”|p n= 0 (s>0), 


—lyri(n—l1)! 
and that Ar-(x +8) (w+8—n)| p29 =o ae 


Hence (A 4) may be written, 


ie 1)t 
—r)! 


cr" (an(ay? = 7 {L204 — Yoana 3 (TTA — Yong) (AS) 
0 


It may also be verified that 


"5 (= yeetarsetn(e— NY oag = (— 1) LAr tata — 2) eo. 


Thus ot aio alder — yen (— ILA — nya) (8) 


Using Cauchy’s integral formula, we may write, 








Artanis ny = TED f a(z—1)...(2—r+1)(z—n) (2—n—1)... (2- marr) (A) 
Cc 


Qn (z—2) (2-2-1)... (z-2—r-1) 
where the points z,7+1,...,.e+7+1 are interior to the simple closed curve C. In particular, for 
x=n-—l, 











(r+1)! 2a(z—1)...(z—r+1) 
P+1ght ye — nr) = 7 A8 
[Arriah — 8 lawns 277i | jetties ™ wae 
We may evaluate (A 8) by residues and obtain 
! — ! 
[Arete — yuan = rt] OEP “genoa ie 
Similarly, for x = —1, we get 
(n+r)! (n—1)! 
T+1y(M( » — nr) =(-—l)jt1r! skis cegneaetamnareeate . Ald 
[Arteta — nM]. = (— Lette [ — an (A 10) 


Finally, from (A 6), (A9) and (A 10), 


(r!)? 5 (At,(x))? "oe 


(n—1 —r)! n!} (n—1-r)! 


_ Ar!) (n—1)t] (n+r)! ol 








= ce ee a | (All) 
(n—1)! (n—i)! 
where r = i-1. From (A 1), (A 2) and (A 11), we get

\[ d_{ii} = \frac{2(2i-1)\,(n-1)!}{(n+i-1)!} \left[ \frac{(n+i-1)!}{n!} - \frac{(n-1)!}{(n-i)!} \right] \quad (i = 2, 3, \ldots, n). \tag{A 12} \]


Now consider the (n-m) x (n-m) matrix $D_{n-m}$ with elements $d_{ij}$ (i = m+1, ..., n; j = m+1, ..., n).
$D_{n-m}$ is the principal submatrix of D located in the lower right corner. The latent roots of D are
$\lambda_1 = 0 < \lambda_2 < \ldots < \lambda_n$, given by (2.6). There will be no loss of generality in taking the latent roots of
$D_{n-m}$ to be $l_1 \le l_2 \le \ldots \le l_{n-m}$. By (2.13), we have

\[ \lambda_2 + \lambda_3 + \cdots + \lambda_n = (d_{22} + \cdots + d_{mm}) + (d_{m+1,m+1} + \cdots + d_{nn}). \tag{A 13} \]

Since the trace is invariant under orthogonal transformations, $d_{m+1,m+1} + \cdots + d_{nn} = l_1 + l_2 + \cdots + l_{n-m}$.
Thus from (A 13) we obtain

\[ \lambda_2 + \lambda_3 + \cdots + \lambda_n = (d_{22} + \cdots + d_{mm}) + l_1 + l_2 + \cdots + l_{n-m}, \tag{A 14} \]

and hence,

\[ (\lambda_n - l_{n-m}) + (\lambda_{n-1} - l_{n-m-1}) + \cdots + (\lambda_{m+1} - l_1) = (d_{22} + \cdots + d_{mm}) - (\lambda_2 + \cdots + \lambda_m). \tag{A 15} \]
It follows from the spacing of the latent roots of principal submatrices of real symmetric matrices (see,
for example, Turnbull & Aitken (1932)), that each term on the left of (A 15) is positive or zero. Now
consider the terms on the right of (A 15). Since m is O(1) with respect to n,

\[ 0 < \lambda_2 < \lambda_3 < \ldots < \lambda_m = 4 \sin^2 \frac{(m-1)\pi}{2n} = O(n^{-2}). \]

Thus $\lambda_2 + \cdots + \lambda_m$ is $O(n^{-2})$. From (A 12),

\[ d_{ii} = \frac{2(2i-1)\,[(n+i-1)(n+i-2)\cdots(n+1) - (n-1)(n-2)\cdots(n-i+1)]}{(n+i-1)(n+i-2)\cdots(n+1)\,n} \quad (i = 2, 3, \ldots, n). \]

In particular, for $2 \le i \le m$, $d_{ii}$ is $O(n^{-2})$. Thus $(\lambda_n - l_{n-m}) + (\lambda_{n-1} - l_{n-m-1}) + \cdots + (\lambda_{m+1} - l_1)$ is $O(n^{-2})$.
Since each term in this sum is positive or zero, it follows that each $(\lambda_{m+s} - l_s)$ is at most $O(n^{-2})$ (s = 1, 2, ..., n-m).
Thus to $O(n^{-2})$ we may replace $l_s$ by $\lambda_{m+s}$ (s = 1, 2, ..., n-m).
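The closed form for the diagonal elements of D and the trace identity can be checked numerically for small n. The sketch below assumes the orthogonal matrix of (2.9) in the form read here, namely row i proportional to t_{i-1}(j-1) with t_r(x) = (1/r!) Delta^r x^(r) (x-n)^(r) and normalizing constant [(2i-1)(n-i)!/(n+i-1)!]^(1/2). It builds each row from finite differences, forms d_ii as the sum of squared successive differences along the row, and compares with the closed expression. All function names are hypothetical.

```python
import math


def falling(x, r):
    """Falling factorial x(x-1)...(x-r+1); equals 1 when r = 0."""
    out = 1
    for k in range(r):
        out *= x - k
    return out


def t_poly(r, x, n):
    """t_r(x) = (1/r!) * (r-th forward difference of x^(r) (x-n)^(r))."""
    total = 0
    for k in range(r + 1):
        sign = (-1) ** (r - k)
        total += sign * math.comb(r, k) * falling(x + k, r) * falling(x + k - n, r)
    return total / math.factorial(r)


def xi_row(i, n):
    """Row i (1-based) of the orthogonal matrix, j = 1, ..., n."""
    c = math.sqrt((2 * i - 1) * math.factorial(n - i) / math.factorial(n + i - 1))
    return [c * t_poly(i - 1, j - 1, n) for j in range(1, n + 1)]


def d_diag_numeric(i, n):
    """d_ii as the sum of squared successive differences along row i."""
    row = xi_row(i, n)
    return sum((row[j] - row[j - 1]) ** 2 for j in range(1, n))


def d_diag_closed(i, n):
    """Closed form 2(2i-1)[P - R] / (P n) for d_ii."""
    p = math.factorial(n + i - 1) // math.factorial(n)  # (n+i-1)...(n+1)
    r = math.factorial(n - 1) // math.factorial(n - i)  # (n-1)...(n-i+1)
    return 2 * (2 * i - 1) * (p - r) / (p * n)


n = 8
rows = [xi_row(i, n) for i in range(1, n + 1)]
trace = sum(d_diag_closed(i, n) for i in range(2, n + 1))
```

For i = 2 the closed form reduces to 12/{n(n+1)}, which shows directly that the leading diagonal elements are of order n^{-2}; the trace over i = 2, ..., n equals the sum of the latent roots of Q, which is 2(n-1).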


Durbin & Watson’s statistic $d_u$ may be written $d_u = \sum_{i=1}^{n-m} \lambda_{m+i}\zeta_i^2 \big/ \sum_{i=1}^{n-m} \zeta_i^2$, where the $\zeta$’s are independent
N(0, 1) variables. The joint moment-generating function of the numerator and denominator of $d_u$ is

\[ M_u(T_0, T) = E\left[\exp\left( T_0 \sum_{i=1}^{n-m} \zeta_i^2 + T \sum_{i=1}^{n-m} \lambda_{m+i}\zeta_i^2 \right)\right]
              = \prod_{i=1}^{n-m} (1 - 2T_0 - 2T\lambda_{m+i})^{-1/2}. \]

The significance points of the distribution of $d_u$ give upper bounds to the significance points of the
statistic d. For regression on the orthogonal polynomials, d may be written as

\[ d = \sum_{i=1}^{n-m} l_i \zeta_i^2 \Big/ \sum_{i=1}^{n-m} \zeta_i^2, \]

and hence the joint moment-generating function of the numerator and denominator of d is

\[ M_d(T_0, T) = \prod_{i=1}^{n-m} (1 - 2T_0 - 2T l_i)^{-1/2}. \]

It follows from the foregoing remarks that, in the case of the orthogonal polynomials, the distribution
of d differs from that of $d_u$ by quantities which are $O(n^{-2})$ since, to this order, we may replace $l_i$ with
$\lambda_{m+i}$. This conclusion was reached independently by Hannan (1957).
Nomograms for fitting the logistic function by
maximum likelihood


By JOSEPH BERKSON, M.D., D.Sc.* 
Section of Biometry and Medical Statistics, Mayo Clinic, Rochester, Minnesota 


1. THEORY AND DESCRIPTION OF THE TABLES 


The logistic function has been used as a model for bio-assay and other experiments in 
which the relative frequency, p, corresponding to the probability P of some ‘response’ is 
observed for some ‘stimulus’, x, and P is a monotonic function of x. The ‘response’ can be, 
for example, the deaths among n animals to which some mortally toxic drug of dosage x has 
been administered, or the explosions among n shells that have been subjected to a percussion
of force x. 


The logistic function can be written

\[ P_i = 1 - Q_i = \frac{1}{1 + e^{-(\alpha + \beta x_i)}}, \tag{1} \]

where $P_i$ is the probability of the response at $x_i$. At each of $k \ge 2$ values of x, assumed to be
without error, the relative frequency $p_i = r_i/n_i$, assumed to be a binomial variable on $P_i$, is
observed, where $r_i$ is the number of responses out of $n_i$ trials. The parameters to be estimated
are $\alpha$, $\beta$. Frequently it is desired to estimate the ED 50, which is the value of x corresponding
to $100 P_i = 50\,\%$; this is given by $\gamma = -\alpha/\beta$.

The maximum likelihood estimates of $\alpha$, $\beta$, represented as $\hat\alpha$, $\hat\beta$, are given by the solution for
$\hat\alpha$, $\hat\beta$ of the estimating equations

\[ \sum_{i=1}^{k} n_i p_i = \sum_{i=1}^{k} n_i \hat P_i, \tag{2} \]

\[ \sum_{i=1}^{k} n_i p_i x_i = \sum_{i=1}^{k} n_i \hat P_i x_i, \tag{3} \]

where $\hat P_i$ is the estimate of $P_i$ given by

\[ \hat P_i = \frac{1}{1 + e^{-(\hat\alpha + \hat\beta x_i)}}. \tag{4} \]

The solutions of (2) and (3) cannot in general be obtained explicitly, and an iterative
procedure must be used which yields $\hat\alpha$, $\hat\beta$ as limits, when these limits exist (Berkson, 1957).
The maximum likelihood estimate of $\gamma$ is $\hat\gamma = -\hat\alpha/\hat\beta$.
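One common iterative procedure for equations of this kind is Newton-Raphson, sketched below; it is offered as an illustration only and is not claimed to be the procedure of Berkson (1957). The model is P = 1/(1 + exp(-(alpha + beta x))), the score is the difference between the two sides of the estimating equations, and the weights n P Q form the information matrix. The function name, starting values and stopping rule are assumptions; the data are those of Example 1 in the illustrations (doses 0, 1, 2, 3; n = 30; r = 1, 8, 15, 23).

```python
import math


def fit_logistic(x, n, r, iterations=50):
    """Newton-Raphson solution of sum(r) = sum(n P) and sum(r x) = sum(n P x)."""
    alpha, beta = 0.0, 0.0
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, ni, ri in zip(x, n, r):
            p = 1.0 / (1.0 + math.exp(-(alpha + beta * xi)))
            w = ni * p * (1.0 - p)       # information weight n P Q
            g0 += ri - ni * p            # score for alpha
            g1 += (ri - ni * p) * xi     # score for beta
            h00 += w
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        da = (h11 * g0 - h01 * g1) / det
        db = (h00 * g1 - h01 * g0) / det
        alpha += da
        beta += db
        if abs(da) + abs(db) < 1e-12:
            break
    return alpha, beta


doses = [0, 1, 2, 3]
trials = [30, 30, 30, 30]
responses = [1, 8, 15, 23]
alpha_hat, beta_hat = fit_logistic(doses, trials, responses)
gamma_hat = -alpha_hat / beta_hat  # estimated ED 50
```

For these data the estimates should agree closely with the iteratively computed values quoted in Example 1 (ED 50 about 2.003, slope about 1.299).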

However, it can be seen from (2) and (3) that the estimates are functions of the observations
only through $\sum n_i p_i$ and $\sum n_i p_i x_i$.

$\sum n_i p_i$ is minimal sufficient for $\alpha$, with $\beta$ fixed, and $\sum n_i p_i x_i$ is minimal sufficient for $\beta$,
with $\alpha$ fixed (Berkson, 1955). The maximum likelihood estimates of $\alpha$ and $\beta$ are functions
of $\sum n_i p_i$, $\sum n_i p_i x_i$, which is illustrative of the theorem, proved early in the history of the
subject, that when a minimal sufficient set of statistics other than the complete observations
of the sample itself exists, the maximum likelihood estimate is a function of these
sufficient statistics (Fisher, 1925; Neyman, 1935). If all $n_i = n$, then the maximum likelihood
estimates of $\alpha$, $\beta$ will be functions of $\sum p_i$, $\sum p_i x_i$ independent of the value of n.

Worcester & Wilson (1943) noted in the case of three equally spaced doses with equal n, that
the maximum likelihood estimates can be tabled against an argument of two quantities
which, when examined, can be seen to be in one-to-one relation with $\sum p_i$, $\sum p_i x_i$, and these
authors provided such a table. Cornfield & Mantel (1950) pointed out that such tables
could be constructed for any specified dosage arrangement, and the present author (Berkson,
1953) observed the same fact but suggested and illustrated nomographic presentation. For
the required refinement of reading of argument, tabular presentation is formidably elaborate.
Moreover, computation of such tables requires calculations which are quite tedious. For
preparation of nomograms giving the estimate of the parameters from $\sum p$, $\sum px$, we use the
fact that for any value of $\alpha$ (or $\gamma$) with varying $\beta$, the corresponding values of $\sum p$, $\sum px$ are
the same values which, if found from a sample, yield as the maximum likelihood estimate
the $\alpha$ (or $\gamma$) in question. Similarly, one can determine the values of $\sum p$, $\sum px$ corresponding
to a defined value of $\beta$ with varying $\alpha$.

The nomograms presented in the following pages provide maximum likelihood estimates
of $\gamma$ and $\beta$ from computed values of $\sum p$ and $\sum px$, for equally spaced dosages and equal n at
each dose, for 3, 4, 5 and 6 doses.* A separate nomogram is given for $\gamma$ and for $\beta$.

For the most efficient spacing in the construction of the nomograms the dosages are coded
with x = 0 at the centre of the range. But for computation, it is convenient to scale the
dosages as 0, 1, 2, ..., (k - 1). The dosage on the computing scale will be referred to as X, that
on the nomographic scale as x. For the use of the nomograms, one computes $\sum p$ and $\sum pX$
from the data recorded at dosages X. Then, for consulting the nomogram,

\[ \sum px = \sum pX - \tfrac{1}{2}(k - 1) \sum p. \tag{5} \]

The value of $\gamma$ is given in the nomogram directly in terms of the computing scale X. In
order to increase the sensitivity of the nomograms they are designed to give $\gamma$ only for
positive values in terms of x. When the value of $\gamma$ in these terms is positive $\sum p$ will be
located on the left-hand scale, when it is negative $\sum p$ will be found on the right-hand scale.
In the latter case the value read as $\gamma$ in terms of X is converted to the correct value by

\[ \hat\gamma = (k - 1) - \gamma. \tag{6} \]

Values of $\hat\gamma$ are provided over the entire, or almost the entire, range of doses. In a well-designed
experiment, the ED 50 should fall near the centre of the range of doses; if it falls
far from this position the experiment should be redone. Values of $\hat\beta$ are given over a range
found to cover practical work.
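The arithmetic needed before consulting a nomogram is short enough to sketch. The helper below forms the sum of p and the sum of pX on the computing scale X = 0, 1, ..., k-1, then shifts to the nomographic scale by subtracting (k-1)/2 times the sum of p; a second helper applies the conversion for a value read against the right-hand scale. Both function names are hypothetical, and equal n at each dose is assumed, as for the nomograms themselves.

```python
def nomogram_arguments(responses, n):
    """Return (sum p, sum pX, sum px) for doses coded X = 0, 1, ..., k-1."""
    k = len(responses)
    p = [r / n for r in responses]
    sum_p = sum(p)
    sum_p_big_x = sum(X * pi for X, pi in enumerate(p))
    sum_p_small_x = sum_p_big_x - 0.5 * (k - 1) * sum_p
    return sum_p, sum_p_big_x, sum_p_small_x


def convert_right_hand_reading(gamma_read, k):
    """Correct a value read when the sum of p lies on the right-hand scale."""
    return (k - 1) - gamma_read


# Data of Example 1: four doses, n = 30, responses 1, 8, 15, 23.
sum_p, sum_pX, sum_px = nomogram_arguments([1, 8, 15, 23], 30)
```

For Example 1 this gives a sum of p of about 1.567 and a nomographic argument of about 1.217; for Example 2 a reading of 1.912 on the right-hand scale converts to 3 - 1.912 = 1.088.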

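For the variances of the estimates, the following ‘large sample’ formulae are available:

\[ s^2(\hat\beta) = \frac{1}{n \sum \hat P_i \hat Q_i (x_i - \bar x)^2}, \tag{7} \]

\[ s^2(\hat\gamma) = \frac{1}{\hat\beta^2} \left[ \frac{1}{n \sum \hat P_i \hat Q_i} + \frac{(\hat\gamma - \bar x)^2}{n \sum \hat P_i \hat Q_i (x_i - \bar x)^2} \right], \tag{8} \]

where $\bar x = \sum n_i \hat P_i \hat Q_i x_i \big/ \sum n_i \hat P_i \hat Q_i$.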
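A short sketch of the large-sample standard errors follows, assuming equal n at each dose, doses coded 0, 1, ..., k-1, and fitted probabilities P = 1/(1 + exp(-beta(x - gamma))), which is the logistic function rewritten with alpha = -beta * gamma. The weighted mean dose uses weights P Q. The function name is hypothetical; the inputs in the usage line are the iteratively computed estimates quoted for Example 1.

```python
import math


def standard_errors(gamma_hat, beta_hat, k, n):
    """Return (s(gamma_hat), s(beta_hat)) from the large-sample formulae."""
    doses = range(k)
    p = [1.0 / (1.0 + math.exp(-beta_hat * (x - gamma_hat))) for x in doses]
    w = [pi * (1.0 - pi) for pi in p]                 # P Q at each dose
    sum_w = sum(w)
    x_bar = sum(wi * x for wi, x in zip(w, doses)) / sum_w
    s_xx = sum(wi * (x - x_bar) ** 2 for wi, x in zip(w, doses))
    var_beta = 1.0 / (n * s_xx)
    var_gamma = (1.0 / (n * sum_w)
                 + (gamma_hat - x_bar) ** 2 / (n * s_xx)) / beta_hat ** 2
    return math.sqrt(var_gamma), math.sqrt(var_beta)


# Example 1: estimates 2.0030 and 1.2992, four doses, n = 30.
s_gamma, s_beta = standard_errors(2.0030, 1.2992, 4, 30)
```

Multiplied by the square root of n, these should reproduce the tabled values to two significant figures (about 0.98 and 1.3 for Example 1).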


* A nomogram was prepared also for 2 doses and plans for its publication announced (Hodges, 1958). 
However, since for this experiment the estimates of the parameters are given by the straight line join 
of the points represented on a logit graph, or the simple addition and subtraction of the corresponding 
logits, it was decided that its publication would be supererogatory. 
Since, for any specified dosage arrangement, $\hat P$, $\hat Q$ and $\bar x$ are determined by $\hat\gamma$, $\hat\beta$, the
estimated variances are determined by the estimates $\hat\gamma$, $\hat\beta$ and n. In the tables preceding the
nomograms the standard error of each estimate, multiplied by $\sqrt{n}$, is given to two significant
figures, which are enough for practical purposes. The standard errors change slowly with
$\hat\gamma$, $\hat\beta$, which are tabled as argument only with sufficient refinement to determine the standard
errors unequivocally.


2. ILLUSTRATIONS

Example 1 (k = 4 doses).

   X     n     r      p
   0    30     1    0.0333
   1    30     8    0.2667
   2    30    15    0.5000
   3    30    23    0.7667

$\sum p = 1.5667$, $\sum pX = 3.5668$, $\sum px = \sum pX - 1.5 \sum p = 1.21675$.

Locate 1.567 on the left-hand ordinate $\sum p$ scale, and 1.217 on the lower or upper abscissal $\sum px$
scale. A straight edge, preferably thin and transparent, should be used as an aid in locating
the position of the co-ordinate values. The estimates $\hat\gamma$, $\hat\beta$ can generally be read exactly to two
significant figures with the third significant figure estimated by interpolation. In some
sections of the nomograms an additional place of accuracy can be estimated. From the
nomograms are found $\hat\gamma = 2.008$, $\hat\beta = 1.30$.

The values obtained by iterative computation are

$\hat\gamma = 2.0030$, $\hat\beta = 1.2992$.

From the tables

$s(\hat\gamma) = 0.98/\sqrt{30} = 0.18$, $s(\hat\beta) = 1.3/\sqrt{30} = 0.24$,

so that $\hat\gamma = 2.01 \pm 0.18$, $\hat\beta = 1.30 \pm 0.24$.


Example 2 (k = 4 doses).

   X     n     r      p
   0    30    10    0.3333
   1    30    14    0.4667
   2    30    20    0.6667
   3    30    23    0.7667

$\sum p = 2.2334$, $\sum pX = 4.1002$, $\sum px = \sum pX - 1.5 \sum p = 0.7501$.

Locate 2.233 on the right-hand $\sum p$ scale and 0.750 on the abscissal $\sum px$ scale. From the
nomograms are found $\gamma = 1.912$, $\hat\beta = 0.652$.
Since $\sum p$ was read on the right-hand scale,

$\hat\gamma = 3 - \gamma = 1.088$.

The values obtained by iterative computation are

$\hat\gamma = 1.0935$, $\hat\beta = 0.6520$.

From the tables, corresponding to $\gamma = 1.91$ and $\hat\beta = 0.65$,

$s(\hat\gamma) = 1.7/\sqrt{30} = 0.31$, $s(\hat\beta) = 1.0/\sqrt{30} = 0.18$,

so that $\hat\gamma = 1.09 \pm 0.31$, $\hat\beta = 0.65 \pm 0.18$.
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52 18 ‘Tl 18 ‘T1 18 ‘Tl 19 ‘Tl 19 71 19 ‘Tl 2-0 ‘Tl 2-0 ‘12 20 ‘12 2-1 “T2 
54 18 ‘Tl 18 ‘Tl 18 ‘Tl 18 ‘Tl 18 ‘71 18 ‘72 19 ‘72 19 ‘12 20 ‘12 20 ‘73 
56 1-7 12 1-7 72 1-7 12 1-7 ‘72 18 72 18 ‘72 18 73 19 ‘73 19 73 2-0 “74 
58 1-7 12 1-7 72 1-7 72 1-7 ‘73 17 ‘13 1-7 ‘73 18 ‘73 18 ‘TA 18 ‘714 19 “T4 
0-60 16 073 16 O73 16 O73 16 O73 17 O73 17 O74 LT O74 18 O74 18 O75 18 0-7 
62 1-6 ‘TA 16 ‘TA 16 TA 16 ‘TA 16 ‘T4 16 ‘74 17 ‘TS 1-7 “15 1-7 ‘Th 18 7 
“64 15 74 15 T4 15 ‘TA 16 ‘T5 16 75 16 ‘15 16 ‘75 1-7 ‘16 1-7 ‘76 1-7 ‘TT 
66 1-5 75 15 ‘TS 15 ‘T5 15 7 15 75 15 ‘76 16 76 16 ‘TT 16 TT 1-7 “78 
68 15 ‘76 15 ‘76 15 ‘76 15 ‘76 15 76 15 TT 15 TT 16 ‘TT 16 ‘18 16 ‘78 
0-70 14 076 14 O76 14 O77 14 O77 15 O77 15 O77 15 O78 15 078 16 O79 16 79 
“75 1-3 78 13 ‘78 13 79 14 ‘19 14 ‘79 14 ‘79 1-4 0 1-4 80 15 81 15 a2 
80 13 80 1-3 80 1:3 81 13 81 13 81 1:3 $2 13 82 1-4 83 1-4 $3 1-4 “84 
“85 1-2 $3 1-2 83 1-2 83 1-2 83 1-2 83 «1:3 84 13 84 13 85 13 86 1:3 87 
90 1-2 85 1-2 85 1-2 85 1-2 85 1-2 86 1-2 86 1-2 87 12 88 1:3 89 13 “89 
95 1-1 87 1-1 88 1-1 88 1-1 88 1-1 88 1-1 89 1-2 90 1-2 ‘90 1-2 91 1-2 92 
1-00 1-1 90 1-1 90 1-1 90 1-1 91 1-1 91 1-1 92 1-1 92 1-1 93 1-2 94 1-2 95 
1-05 1-0 93 10 93 1-1 93 1-1 93 11 94 1-1 95 1-1 95 1-1 96 1-1 97 1-1 99 
1-10 10 096 10 096 10 096 10 096 10 097 10 097 10 098 11 099 11 10 211 10 
1-12 1-0 97 10 YT 10 97 10 98 1-0 98 10 99 10 99 10 10 11 10 211 10 
1-14 099 «63-98 099 98 099 98 10 99 10 99 10 10 10 10 10 10 10 #10 211 «10 
1-16 9 99 9 99 98 -99 098 10 099 10 099 10 10 10 10 10 10 10 10 10 
1-18 97 10 97 10 97 10 97 10 98 10 98 10 099 10 10 10 10 ‘1 10 = =| 
1-20 096 10 096 10 096 10 096 10 097 10 097 10 098 10 099 11 10 11 10 Id 
1-22 95 1-0 95 10 95 10 95 10 ‘96 1-0 96 1-1 97 1-1 98 11 O99 11 #10 I 
1-24 94 10 94 10 94 10 94 1-1 95 1-1 95 11 96 1-1 97 11 98 11 O99 1-1 
1-26 93 1-1 93 11 93 1-1 93 1-1 94 11 94 1-1 95 1-1 96 1-1 97 1-1 98 1-1 
1-28 92 1-1 92 11 92 1-1 92 1-1 93 1-1 93 1-1 94 1-1 95 11 96 1-1 97 11 
1-30 091 11 O91 11 O91 11 O91 11 O92 1:1 O92 11 0938 11 O94 11 O95 11 096 12 
1-32 90 1-1 90 Il 90 1-1 90 il 91 1-1 91 1-1 92 1-1 93 1-1 94 12 95 1-2 
1-34 89 1-1 89 1-1 89 1-1 90 1-1 90 1-1 90 1-1 91 1-1 92 1-2 93 1-2 94 12 
1:36 88 1-1 88 Ll 89 1-1 $9 1-1 89 1-1 90 1-1 90 1-2 91 1-2 92 1-2 93 1-2 
1-38 88 1-1 88 1-1 88 1-1 88 1-1 88 1-2 89 1-2 89 1-2 90 12 91 1-2 92 1:2 
1-40 087 12 O87 12 O87 12 O87 12 O88 12 088 12 O88 12 O89 12 090 12 O91 12 
1-42 86 1-2 86 1-2 86 1-2 86 1-2 87 1-2 87 12 88 1-2 88 1-2 89 1:2 90 1:3 
1-44 85 1-2 85 1-2 85 1-2 86 1-2 86 1-2 86 1-2 87 12 87 1-2 88 1:3 89 1:3 
1-46 84 1-2 84 1-2 85 1-2 85 1-2 85 1-2 86 1-2 86 1-2 87 13 87 13 88 1:3 
1-48 83 1-2 83 1-2 84 12 84 12 84 1-2 $5 1-2 85 1:3 86 13 86 «1:3 87 13 
1-50 083 12 083 12 O84 12 O84 12 OS4 12 084 13 O85 13 085 13 086 13 087 1:3 
1-52 83 1-2 83 1-2 83 13 83 «1:3 83 #13 84 13 84 13 85 13 85 13 86 1:3 
1-54 $2 13 $2 1:3 82 13 $2 13 83 13 83 13 $3 13 84 13 84 1:3 85 1-4 
1-56 81 13 82 13 82 1:3 82 13 $2 13 82 13 83 1:3 83 «13 84 14 85 1-4 
1-58 81 13 81 13 81 13 81 13 81 13 81 13 $2 1:3 82 1-4 83 1-4 84 1-4 
1-60 080 13 080 13 080 13 O81 13 O81 13 O81 13 O81 14 O82 14 O82 14 O83 1-4 
1-62 80 13 80 13 80 13 80 13 80 13 80 1-4 81 1-4 81 1-4 82 1:4 82 1-4 
1-64 ‘79 13 ‘79 13 79 14 ‘79 1-4 80 14 80 1-4 80 14 . 81 14 81 1-4 82 1-4 
1-66 ‘19 14 ‘79 1-4 79 14 ‘79 1-4 ‘19 1-4 ‘19 14 80 1-4 80 1-4 80 1-4 81 15 
1-68 ‘78 1-4 ‘78 14 ‘78 14 ‘78 14 ‘19 1-4 ‘19 1-4 ‘79 1-4 ‘79 1-4 80 15 80 15 
1-70 O78 14 O78 14 O78 14 O78 14 O78 14 O78 14 O78 14 O79 15 O79 15 080 15 
1-72 ‘TT 1-4 ‘TT 1-4 TT 14 ‘TT 1-4 ‘TT 14 ‘78 1-4 78 15 78 15 79 15 ‘79 15 
1-74 ‘TT 14 ‘TT 14 TT 15 77 1-4 TT 15 TT 15 TT 15 ‘78 15 ‘78 15 79 15 
1-76 ‘TT 15 TT 15 ‘TT 15 77 15 77 145 77 15 TT 15 78 15 ‘78 15 78 16 
1-78 ‘76 15 ‘76 15 ‘76 15 76 15 ‘76 15 ‘76 15 ‘76 15 ‘TT 15 ‘TT 16 ‘TT 16 
1-80 075 15 O75 15 O75 15 O75 15 O76 15 O76 15 0-76 15 O76 15 O77 16 O77 16 
1-82 75 15 Th 165 75 15 ‘TH 15 T 15 ‘TS 15 ‘T6 15 ‘16 16 ‘76 16 ‘TT 16 
1-84 ‘TA 15 ‘TA 15 ‘TA 15 ‘TA 15 75 15 75 16 75 16 ‘75 16 ‘76 16 ‘76 1-7 
1-86 74 15 ‘74 15 ‘74 16 ‘74 16 74 16 ‘15 16 ‘15 16 ‘75 16 ‘T 16 ‘76 17 
1-88 73 16 73 16 73 16 ‘73 16 73 16 ‘74 16 ‘74 16 ‘74 16 ‘TS 16 T5 17 
1-90 073 16 O73 16 O73 16 O73 16 O73 16 O73 16 O74 16 O74 16 O74 17 O74 17 
1-95 ‘72 16 ‘72 16 72 16 72 16 ‘72 16 72 1-7 ‘73 1-7 ‘73 1-7 ‘73 1-7 ‘73 1-7 
2-00 ‘Tl LT 71 17 ‘Tl 1-7 Tl 17 ‘Tl LT T1 17 72 1-7 ‘72 1-7 72 18 ‘72 18 
2-05 70 1-7 ‘TO 1-7 ‘70 1-7 70 1-7 ‘70 17 ‘TO 18 ‘71 18 ‘71 18 ‘Tl 18 ‘Tl 18 
2-10 69 18 69 18 69 18 69 18 70 18 70 18 70 18 70 18 70 19 #-70 19 
2-15 68 18 68 18 68 18 69 18 69 19 69 19 69 19 69 19 69 19 69 2-0 
2-20 067 19 O68 19 068 19 068 19 068 19 068 19 068 19 068 20 068 20 068 20 
2-25 67 20 67 20 67 20 67 20 67 20 67 20 67 20 67 20 67 20 67 21 
2°30 66 20 66 20 66 20 66 20 66 20 66 2-0 66 2-0 66 2-1 66 2-1 67 2-1 
2-35 65 2-1 65 2-1 65 2-1 65 21 65 2-1 66 2-1 66 2-1 66 2-1 66 2-2 66 2-2 
2-40 64 2-1 64 2-1 65 2-1 65 21 65 2-1 65 2-1 65 2-1 65 2-2 65 2-2 65 2:3 
2-45 64 2-2 64 2-2 64 2-2 64 2-2 64 2-2 64 2-2 64 2-2 64 2-2 64 2-3 64 23 
2-50 63 2-3 63 2-3 63 2-3 64 2-2 64 2-2 64 2-2 64 2-3 64 23 63 2-3 63 2-4 
Six doses. Standard error of estimates, $s\sqrt{n}$

Columns: $\hat\gamma$ = 2.5, 2.6, 2.7, 2.8, 2.9, 3.0, 3.1, 3.2, 3.3, 3.4, 3.5, 3.6; under each value of $\hat\gamma$ the two entries are $s(\hat\gamma)$ and $s(\hat\beta)$.
22 053 22 053 22 053 22 053 22 053 22 053 23 053 23 053 23 053 24 053 24 054 25 054 
21 53 2-1 53 2-1 53 21 53 2-1 53 21 53 2-2 53 2-2 54 2-2 54 2-3 54 2:3 54 24 54 
2-0 54 2-0 54 2-0 54 2-0 54 2-0 54 2-0 54 21 54 21 54 21 54 2-2 55 2-2 55 23 455 
1-9 54 1-9 54 19 54 19 54 1-9 54 2-0 54 2-0 55 2-0 55 2-0 65 21 55 21 55 2-2 56 
1-8 55 18 55 19 55 19 55 19 55 1-9 55 19 55 (1-9 55 2-0 55 2-0 56 2-1 56 2-1 56 
18 055 18 O55 18 055 18 O55 18 0856 18 056 18 056 19 0856 19 056 19 056 2:0 0-57 20 057 
17 56 1-7 56 1-7 56 1-7 66 1:7 56 18 56 18 56 18 57 18 57 1-9 57 19 57 19 +58 
17 57 17 57 17 ‘57 (1-7 57 17 57 1-7 57 17 57 1-7 57 1:8 58 18 58 1:8 58 1-9 59 
16 57 16 57 16 57 16 57 16 67 1-7 68 17 58 1-7 59 17 59 17 59 18 59 18 59 
16 58 16 58 16 58 16 58 16 58 16 58 16 59 16 59 1:7 59 1-7 59 17 60 18 60 
15 059 15 O89 15 059 15 059 15 0859 16 059 16 059 16 060 16 060 16 060 17 061 17 O61 
15 59 15 59 15 59 15 60 15 60 1:5 60 15 60 15 60 16 61 16 61 16 61 17 8 
14 60 1-4 60 15 60 15 60 15 60 15 ‘61 15 61 15 61 15 61 1-5 62 16 62 16 63 
14 61 1-4 ‘61 14 ‘61 14 ‘61 1-4 61 1-4 61 15 62 15 62 15 62 15 63 15 63 16 64 
1-4 62 1-4 62 1-4 62 14 62 14 62 1-4 62 14 63 14 63 1-5 63 1-5 64 15 64 15 65 
13 062 14 063 14 063 14 063 14 063 14 063 14 063 14 064 14 064 14 065 15 065 15 0-66 
13 65 1:3 65 1:3 65 1:3 65 1:3 65 13 65 1:3 66 1:3 66 1:3 66 1-4 67 1-4 68 14 63 
1-2 67 1-2 67 1-2 67 1-2 67 1-2 67 1-2 68 1:3 68 1-2 68 1:3 69 1:3 69 1:3 ‘70 13 71 
1-2 69 1-2 69 1-2 69 1-2 ‘70 12 ‘70 1-2 ‘70 1:2 ‘71 12 ‘71 1-2 ‘72 +12 ‘72 «1:3 ‘73 13 74 
11 72 11 72 11 72 11 ‘72 11 ‘73 11 ‘73 11 ‘73 1-2 ‘74 1:2 ‘74 1:2 ‘75 12 76 1-2 17 
1-1 ‘75 Ll ‘75 Ll ‘7% 11 ‘75 1-1 ‘75 11 ‘76 Ll ‘76 11 77 11 77 V1 78 Ll 79 1-2 80 
1-0 77 10 ‘78 «11 ‘78 11 ‘78 Ll ‘78 11 ‘79 Ll ‘79 1-1 80 11 80 11 ‘81 11 82 11 83 
1-0 80 10 ‘81 210 ‘81 10 ‘81 10 81 1-0 82 1-0 $2 1-0 83 1-0 83 1-1 $4 11 85 11 86 
0-99 0-84 0-99 G84 0:99 0-84 0-99 0:84 0-99 084 10 085 10 085 10 086 10 087 10 0-88 10 089 10 09 
98 85 -98 ‘85 -98 ‘85 -98 -85 -98 -85 0-98 -86 0-99 -87 0-99 -87 10 88 1-0 89 1-0 90 10 Ol 
97 #86 +97 +86 -97 ‘86 -97 -87 +97 ‘87 -97 -87 -98 88 -98 -89 099 -90 1:0 91 10 92 10 93 
96 -88 -96 ‘88 -96 -88 +96 -88 -96 -89 -96 -89 -97 -89 -97 -90 -98 -91 0-98 -92 0-99 -93 10 94 
95 -89 -95 89 -95 -89 95 -89 -95 -90 -95 90 -96 -91 -96 -92 -97 -92 -97 -93 -98 -94 0-99 -% 
0-94 0-90 0:94 0-91 0-94 0:91 0:94 0-91 0:94 0-91 0-94 0-92 0:95 0:92 0-95 0-93 0:96 0-94 0-96 0:95 0:97 0-96 0-98 0-97 
93 +92 +93 -92 -93 +92 -93 -92 -93 -93 -93 -93 -94 -94 -94 -94 -95 -95 -95 -96 -96 -97 -97 -®# 
92 -93 -92 +93 -92 -93 -92 -93 -92 -94 +92 -94 -93 -95 -93 -96 -94 -97 -94 -98 +95 -99 -96 10 
91 -95 ‘91 -95 -91 -95 -91 -95 -91 +95 -91 -96 -92 -97 -92 -97 -938 -98 -93 -99 -94 1-0 95 10 
90 96 -90 -96 90 -96 -90 -97 -90 -97 -91 -97 -91 -98 -91 -99 -92 1-0 92 1-0 93 1-0 94 10 
0-89 0-97 089 0-98 0-89 0-98 0-90 0-98 0-90 0-98 0:90 0-99 090 10 091 10 091 10 091 10 0-92 10 0-93 10 
88 +99 +88 -99 -89 -99 89 -99 -89 -99 -89 1-0 89 «1-0 90 1-0 90 10 ‘91 1-0 ‘91 11 92 11 
88 10 88 1-0 88 1-0 88 10 88 1-0 88 1-0 88 1-0 89 1-0 89 11 90 11 90 11 ‘91 11 
87 10 87 10 87 10 87 10 87 1-0 87 1-0 88 1-0 88 11 88 11 89 11 89 11 90 11 
86 1-0 86 1-0 86 1-0 86 10 86 1-0 87 1-1 87 11 87 1-1 88 11 88 11 89 11 89 11 
086 11 O86 11 086 11 086 11 086 11 086 11 O86 11 O87 11 087 11 O87 11 088 11 088 LI 
85 11 85 11 85 11 85 1-1 85 1-1 85 11 86 1-1 86 1-1 86 1-1 86 11 87 11 87 Ll 
84 11 84 11 $4 11 84 11 85 1-1 85 11 85 11 85 11 85 1-1 86 1-1 86 11 87 12 
$4 11 84 11 84 11 84 11 84 1-1 $4 1-1 84 1-1 84 1-1 85 11 85 1-2 85 1:2 86 12 
83 1-1 83 11 83 11 83 11 83 11 83 1-1 83 1-1 84 11 84 1-2 84 1-2 85 1-2 85 12 
0-82 11 O83 11 083 11 083 11 083 11 0-83 12 083 12 0-83 12 083 12 084 12 0-84 12 0-85 12 
82 1-2 82 1-2 82 1-2 82 1-2 82 1-2 82 1-2 $2 12 83 1-2 83 1-2 83 1-2 83 12 84 12 
81 1-2 81 1-2 $1 1-2 ‘81 1-2 82 1-2 82 1-2 82 1-2 82 1:2 82 1-2 82 1-2 83 12 83 13 
81 1-2 81 1-2 ‘81 1:2 ‘81 12 ‘81 1-2 81 1:2 81 1-2 81 1-2 82 1-2 82 1-2 82 13 83 13 
80 12 80 1-2 80 1:2 80 1:2 80 1-2 80 1:2 81 1-2 81 1-2 $1 1-2 81 1:3 ‘81 13 82 13 
0-80 12 080 12 080 12 080 12 080 12 080 12 080 12 080 13 080 13 081 13 081 13 O81 13 
‘79 1-2 ‘79 1-2 ‘79 «1:2 79 13 80 1:3 80 1:3 80 13 80 1:3 80 1:3 80 13 80 13 80 13 
‘78 13 79 1:3 ‘79 13 ‘79 1:3 ‘79 1:3 ‘79 1:3 ‘79 1:3 ‘79 1:3 ‘79 1:3 80 1:3 80 13 80 13 
‘78 13 78 13 ‘78 13 ‘78 13 ‘78 1:3 ‘79 13 79 13 79 1:3 ‘79 1:3 ‘79 «1:3 ‘79 13 80 14 
‘78 1:3 78 13 ‘78 13 ‘78 1:3 ‘78 13 ‘78 13 78 13 ‘78 13 ‘78 13 ‘78 14 ‘79 1-4 ‘79 «14 
077 13 O77 13 0-77 13 O77 13 0-77 13 0-77 13 O77 18 O78 13 078 14 O78 14 O78 14 0-78 14 
77 «153 ‘17 13 17 13 ‘77 «13 ‘77 (13 ‘717 1:3 ‘77 14 ‘77 1-4 ‘17 1-4 ‘78 1-4 ‘78 14 ‘78 14 
76 14 76 14 ‘76 1-4 ‘77 14 77: «(14 ‘77 1-4 77 14 77 14 ‘17 1-4 ‘17 1-4 77 14 ‘78 14 
76 14 76 1-4 ‘76 14 ‘76 14 76 14 ‘76 1-4 76 14 76 1-4 77 14 77 14 ‘77 14 ‘77 15 
75 14 75 14 ‘75 14 ‘75. 14 75 14 7%5 14 75 14 75 14 76 1-4 76 14 ‘76 15 76 15. 
075 14 #O7F 14 O75 14 O75 14 075 14 O75 14 O75 14 075 14 O75 14 O75 14 076 15 0-76 15 
74 14 ‘74 1-4 74 14 ‘74 1-4 ‘74 1-4 ‘74 1-4 ‘74 15 74 15 ‘75 15 ‘75 15 ‘75 15 ‘75 15 
‘74 14 ‘74 14 ‘74 15 ‘74 15 ‘74 145 ‘74 15 74 15 ‘74 15 ‘74 15 75 15 ‘75 15 7 15 
‘74 15 ‘74 15 74 15 ‘74 15 ‘74 155 ‘74 155 ‘74 15 ‘74 15 ‘74 15 ‘74 15 74 15 ‘74 16 
73 15 ‘73 15 73 15 ‘73 15 ‘73 15 ‘73 15 73 15 73 15 74 15 ‘74 15 ‘74 16 ‘74 16 
0-73 15 O73 15 O73 15 O73 15 O73 15 O73 15 073 15 073 15 0-73 16 0-73 16 0-73 16 0-74 16 
72 145 72 16 ‘73 15 ‘73 15 ‘73 15 ‘73 16 ‘73 16 73 16 ‘73 16 ‘73 16 ‘73 16 73 146 
72 15 ‘72 16 72 16 ‘72 16 72 16 ‘72 16 ‘72 16 72 16 72 16 72 16 73 16 73 146 
72 16 ‘72 16 ‘72 16 72 16 ‘72 16 ‘72 16 ‘72 16 72 16 72 16 ‘72 16 ‘72 16 ‘73 LT) 
71 16 ‘71 16 7L 16 71 16 ‘71 16 ‘71 16 71 16 71 16 72 16 ‘72 16 72 1:7 72 17 
O71 16 O71 16 O71 16 O71 16 O71 16 O71 16 O71 16 O71 16 O71 17 O71 1-7 O71 17 0-72 17 
2-5 26 2-7 268 29 30 3-1 3-2 3-3 3-4 3-5 36 
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Nomograms for fitting the logistic function 131 
Sia doses (cont.). Standard error of estimates, s/n 
4 
3-7 38 3-9 4-0 4-1 4-2 4-25 43 4-35 44 4-45 45 
Geeta, promt, GeromDenam, gooey, qumeneDememen, Comet, cameearmweny: gman, 
45) A) 5) A) 9H) 9A) 9H) 9B) 9H) SB) 9) A) 8H) 9A) HH) 9B) HH) HA) HG) HP MP) WA 0H) 
26 0-54 26 O54 27 054 28 055 2:9 055 29 0:55 30 0:55 30 0:56 31 056 31 056 3:2 056 32 0:56 
24 55 25 +55 26 55 26 -55 27 +56 28 -56 28 56 29 56 29 57 30 -57 30 “57 31 57 
23 4-55 24 56 25 +56 25 36 26 56 27 +57 27 8 «86+57 28 86-57 280 HT 28 5B 2M BB AD CBB 
22 36 23 +56 24 57 24 +57 25 +57 26 58 26 58 26 58 27 58 27 +59 28 86-59 28 «5D 
21 +57 22 +57 23 +57 23 +58 24 +59 25 +59 25 +59 25 59 26 59 26 -6€0 27 -6€0 27 -60 
21 0-57 21 0-58 22 0658 22 059 23 059 24 059 24 060 24 060 25 060 25 060 26 061 26 0-61 
20 58 20 58 21 59 22 -59 22 -6€0 23 +60 23 61 23 -61 24 -€1 24 61 25 62 25 -62 
19 59 20 +59 21 -6€0 21 -6€0 21 -61 22 -61 22  -62 23 -62 23 62 23 -63 24 -63 24 -63 
19 +60 19 -€0 20 -61 20 -61 21 -62 21 62 22  -63 22 63 22 63 23 -64 23 -64 23 -64 
18 6118  -61 19 +62 19 62 20 -63 21 +63 21 -64 21 -64 22 -64 22 65 22 65 23 -66 
17 061 18 062 18 0-62 19 063 19 064 2:0 064 20 065 21 065 21 066 21 066 22 066 22 0-67 
17 62 17 63 18 63 18 +64 19 +65 19 65 20 -66 20 -66 20 67 21 -67 21 -68 21 -68 
16 63 17 -64 17 64 18 65 18 -66 19 -67 19 +67 19 +67 20 68 20 68 20 -69 21 -69 
16 64 16 65 17 +65 17 +66 18 67 18 +68 19 +68 19 69 19 69 19 -70 20 -70 20 -71 
16 4-65 16 -66 16 67 17 +67 17 68 18 69 18 69 18 -70 19 -70 19 +71 19 -71 20 +72 
15 066 16 0-67 16 068 16 068 17 0-69 17 O70 18 O71 18 O71 18 O71 18 0-72 19 0-72 19 0-73 
14 69 15 -70 15 -70 15 -71 16 +72 16 +73 1-7 8-74 17 +74 17 75 (1700 76 «18 76 «218-77 
14-72 14 «6-72 «140 578 15 74 15 75 15776 77 1678 16 79 16 79 17 80 17 
13-74 13 4-75 13 8-76 14 8-77 14 +79 14 80 15 81 15 -82 15 -82 15 +83 16 -84 16 -85 
12 -78 13 -79 13 80 13 -81 13 214 -84 14 -84 14 +85 14 +86 15 ‘87 15 -88 15 -89 
12 -81 12 -82 12 -83 13 -84 13 -86 13 -88 13 88 13 89 14 90 14 -91 14 -92 14 -98 
11 84 12 -85 12 -87 12 -88 12 -90 13 -91 13 -92 13 -98 13 -94 13 95 14 97 14 -98 
11 86-88 11 8-89 11 +90 12 -92 12 -94 12 -96 12 -97 12 -98 13 -99 13 10 13 10 213 10 
11 0-91 11 0-92 11 094 11 096 11 098 12 10 12 10 12 10 12 10 12 10 12 10 #13 10 
10 -93 10 -94 11 -96 11 -97 11 -99 11 10 12 10 12 10 12 10 212 421 12 21 12 Di 
10 +94 10 -95 11 97 11 -99 11 10 212 10 211 #10 #212 #10 #12 LL 12 21 128 11 #12 211 
10 96 10 97 10 -99 11 10 211 #10 UL 10 11 12 11 11 #212 «21 212 11 «218 «211 =«212 «2D 
10 9710 99 10 10 10 10 21 #10 21 212 22 21 11 #11 «11 «112 «212 11 «22128 «21 «123 «211 
0-99 099 10 10 10 10 #10 #10 421 21 21 21 11 2112 2112 11 211 «211 «2211 «11 «2212 «2212 «2123 «212 
98 10 099 10 10 10 10 4211 10 21 21 21 21 211 211 11 211 211 «21 «212 «211 «212 «212 «212 
97 10 -98 10 099 11 10 21 10 211 10 11 #211 #11 211 211 «211 «12 «211 «212 «211 «212 «211 «212 
96 10 -97 10 -98 11 O99 11 #10 %211 10 21 10 112 21 212 211 12 21 12 211 12 #211 12 
95 10 -96 11 -97 11 -98 11 10 11 10 12 #10 12 10 112 21 12 21 12 11 12 211 213 
0-94 11 095 11 096 11 097 11 O99 11 10 12 10 12 10 12 10 112 211 12 11 #1213 211 «213 
93 11 -94 11 +95 11 96 11 -98 12 10 12 10 12 10 12 10 12 11 #13 11 13 211 213 
92 11 -93 11 -94 11 -95 12 -97 12 099 12 10 12 10 12 10 13 10 13 11 13 11 138 
‘91 11 -92 11 -93 12 -94 12 -96 12 -98 12 098 12 10 13 10 13 10 13 10 13 211 13 
90 11 -91 11 -92 12 98 12 95 12 -96 12 -97 13 099 13 10 13 10 13 10 413 10 14 
0-89 11 0-90 1:2 0-91 12 0-92 12 094 12 095 13 0:96 13 098 13 099 13 10 13 10 14 10 14 
88 12 -89 12 -90 12 -91 12 -93 13 -94 13 +95 13 -96 13 -97 13 099 14 10 14 10 14 
‘87 12 -88 12 -89 12 90 12 -92 13 -93 13 -94 13 -95 13 +96 14 -98 14 099 14 10 14 
86 12 +87 12 -88 12 -89 13 -91 13 -92 13 +938 13 94 14 +95 14 97 14 98 14 0:99 14 
86 1:2 87 (1-2 87 13 88 1:3 90 13 -91 1:3 92 14 -938 14 -94 14 -96 14 97 14 -98 15 
0-85 12 086 13 0-87 13 0-88 13 089 13 0:90 14 O91 14 062 14 093 14 095 14 0:96 15 0-97 15 
8413 +85 13 -86 13 -87 13 -88 13 90 14 +90 14 ‘91 14 938 14 94 15 +95 15 -96 15 
$4 13 -84 13 -85 13 +86 13 -87 14 ‘89 14 +90 14 -91 14 92 15 93 15 94 15 -95 15 
$3 13 -84 13 -85 13 -85 14 -86 14 +88 14 -89 14 90 15 ‘91 15 92 15 -93 15 -94 16 
$2 13 -83 13 -84 14 -85 14 +85 14 +87 14 -88 15 89 15 90 15 91 15 92 16 93 16 
0-82 13 0-82 13 083 14 084 14 085 14 0:86 15 087 15 0-88 15 089 15 090 15 0-91 16 0:92 16 
8113 8114 -82 14 +83 14 -84 15 +85 15 -86 15 ‘87 15 -88 15 +89 16 -90 16 ‘91 16 
81 14 -81 14 -82 14 83 14 -84 15 -85 15 +85 15 +86 15 ‘87 16 88 16 +89 16 -90 146 
80 14 -80 14 -81 14 -82 15 -83 15 ‘84 15 +85 15 +85 16 ‘86 16 -87 16 -88 16 -90 17 
79 14 +80 14 -80 14 -81 15 -82 15 +88 15 +84 16 -84 16 +85 16 -86 16 87 1:7 89 17 
0-79 14 O79 14 080 15 081 15 O81 15 083 16 0-83 16 084 16 085 16 0-86 17 087 1:7 088 17 
7814 -79 15 +79 15 $0 15 -81 16 -82 16 -83 16 83 16 ‘84 17 85 17 86 17 87 17 
78 15 -78 15 -79 15 -79 15 -80 16 ‘81 16 -82 16 ‘82 17 ‘88 1:7 84 17 +85 17 86 18 
7615 0 3-77 15 «6-78 «1500 -78:«216 2-79 «216 S81 1681 782 788 OT BA OT 8H OT 8B OB 
76 150 -77 156 +77 15 -78 16 -79 16 80 1:7 81 17 81 17 +82 17 83 17 86-84 18 = 85 O18 
076 15 O77 15 O77 16 O78 16 0-78 16 0-79 1:7 080 1:7 O81 17 081 17 082 18 083 18 084 18 
7 15 -76 16 +76 16 -77 16 8-78 17 «6-78 «1-70 «6-79 «21702 80 1-70 81 1881 18 82 18 83 1B 
7 16 +76 16 +76 16 -77 16 4 -77 17 +78 1:7 «86-79 :17 0 «6-79 18 «6-80 18 081 18 82 18 83 19 
75 16 8-75 16 -75 16 -76 1-7 «+77 17 «8-78 «17)0« «6-78 «180 «6-79 «2182079 «18 80 18 81 19 821-9 
4 16 +74 16 -75 170 «3-75 170 «76 «2170-77 «2B 77 «sd: 78H. 79 1B 79 19 = 80 19819 
0-74 16 0-74 16 O74 17 O75 17 076 1:7 076 18 O77 18 O78 18 O78 18 079 19 080 19 0-81 1-9 
738 160 48-74 «1-70 74 17075775 B76 B77 1877 1878 1D) 78 19 79 19 80-20 
731700 738 17074 774 170 75 8 7K B76 18 76 1M) 77 178 19 = 79 20 = 179-20 
73 17 738 170 678 17074 170 74 18 75 876 19) 76 1M 77 1877 1978 20 179-20 
72:17 72 «170-7818 78 1B 78 18 74 17H 7H 76 1 77 2077 20 78-20 
0-72 17 0-72 1:7 0-72 18 073 18 073 18 074 19 O74 19 0-75 19 075 20 0-76 20 077 20 0-78 21 
3-7 3.8 3-9 40 41 4-2 4-25 43 4:35 44 4:45 45 
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(For X scaled 0, 1 



















































































The Borel-Tanner distribution


By FRANK A. HAIGHT AND MELVIN ALLEN BREUER
Institute of Transportation and Traffic Engineering, University of California, Los Angeles


1. INTRODUCTION 


Let $\alpha > 0$ and let $r$ be a positive integer. Then
$$p(x; r, \alpha) = A(x, r)\, e^{-\alpha x} \alpha^{x-r} \qquad (x = r, r+1, \ldots) \tag{1}$$
is a probability distribution when
$$A(x, r) = \frac{r}{(x-r)!}\, x^{x-r-1}. \tag{2}$$
Borel (1942) first showed this for the case $r = 1$ by a geometric argument, and Tanner
(1953) pointed out that Borel's argument is in fact valid for general $r$. In the theory of
queues $p(x; r, \alpha)$ represents the probability that exactly $x$ members of a queue will be served
before the queue first vanishes, beginning with $r$ members, and with $\alpha$ equal to the traffic
intensity, assuming Poisson arrivals and constant service time. A continuous analogue of
this distribution occurs in a paper by Kendall (1957), p. 211, equation (20).
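
As a minimal numerical sketch of definitions (1) and (2), the following Python function evaluates $p(x; r, \alpha)$ in log space and sums the probabilities for one case with $\alpha < 1$. The function name `borel_tanner_pmf`, the parameter choice and the truncation point of the sum are illustrative assumptions and are not taken from the paper.

```python
import math

def borel_tanner_pmf(x: int, r: int, alpha: float) -> float:
    """p(x; r, alpha) = A(x, r) * exp(-alpha * x) * alpha**(x - r) for x = r, r+1, ..."""
    if x < r:
        return 0.0
    # log A(x, r) = log r + (x - r - 1) * log x - log (x - r)!
    log_a = math.log(r) + (x - r - 1) * math.log(x) - math.lgamma(x - r + 1)
    return math.exp(log_a - alpha * x + (x - r) * math.log(alpha))

# Truncated sum of the probabilities for r = 2, alpha = 0.5 (illustrative values).
total = sum(borel_tanner_pmf(x, 2, 0.5) for x in range(2, 400))
print(total)
```

Working with logarithms avoids overflow in $x^{x-r-1}$ and $(x-r)!$ when $x$ is large.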

Another interesting property of the distribution is its domain of definition. Most discrete 
distributions are defined over some finite subset of the integers (e.g. binomial, hyper- 
geometric) or over all of the non-negative integers (e.g. Poisson, negative binomial). In 
cases where such a domain is inapplicable, it is necessary to truncate the domain; for ex- 
ample, the Poisson is often used without x = 0. The Borel—Tanner distribution, however, 
is available for any positive integral starting point. 

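First, we suppose that $\alpha < 1$, and let
$$\beta = \alpha e^{-\alpha}. \tag{3}$$
Then
$$e^{-\alpha x} \alpha^{x-r} = \beta^{x} \alpha^{-r} \tag{4}$$
and
$$\frac{d\alpha}{d\beta} = \frac{\alpha}{\beta(1-\alpha)} \tag{5}$$
and
$$\frac{d^{2}\alpha}{d\beta^{2}} = \frac{\alpha^{2}(2-\alpha)}{\beta^{2}(1-\alpha)^{3}}. \tag{6}$$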
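Using these formulae, the mean may be calculated as follows:
$$\mu = \sum x\, p(x; r, \alpha), \qquad \alpha^{r}\mu = \sum x A(x, r)\, \beta^{x}, \tag{7}$$
where these (and all subsequent) summations are over $x = r, r+1, \ldots$. Substituting (4) into
$$1 = \sum A(x, r)\, e^{-\alpha x} \alpha^{x-r}$$
gives
$$\alpha^{r} = \sum A(x, r)\, \beta^{x} \tag{8}$$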
and differentiating with respect to $\alpha$, we have
$$r\alpha^{r-1} = \sum A(x, r)\, x \beta^{x-1}\, \frac{d\beta}{d\alpha}. \tag{9}$$
From (5), (7) and (9), we obtain
$$\mu = \frac{r}{1-\alpha}. \tag{10}$$
This result was obtained by Borel for $r = 1$. The variance can be found similarly:
$$\mu_2' = \sigma^{2} + \mu^{2} = \sum x^{2} p(x; r, \alpha).$$
Differentiating (9) and using (6) gives
$$\sigma^{2} = \frac{\alpha r}{(1-\alpha)^{3}}. \tag{11}$$
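
The closed forms $\mu = r/(1-\alpha)$ and $\sigma^{2} = \alpha r/(1-\alpha)^{3}$ can be checked against direct summation. The sketch below does this for one illustrative parameter pair; the helper names `pmf` and `moments` and the truncation point `x_max` are assumptions made for the example.

```python
import math

def pmf(x, r, alpha):
    """Borel-Tanner probability p(x; r, alpha), evaluated in log space."""
    log_a = math.log(r) + (x - r - 1) * math.log(x) - math.lgamma(x - r + 1)
    return math.exp(log_a - alpha * x + (x - r) * math.log(alpha))

def moments(r, alpha, x_max=2000):
    """Mean and variance by truncated summation over x = r, ..., x_max - 1."""
    xs = range(r, x_max)
    ps = [pmf(x, r, alpha) for x in xs]
    mean = sum(x * p for x, p in zip(xs, ps))
    second = sum(x * x * p for x, p in zip(xs, ps))
    return mean, second - mean ** 2

r, alpha = 3, 0.4
mean, var = moments(r, alpha)
print(mean, r / (1 - alpha))                  # numerical mean vs. closed form
print(var, alpha * r / (1 - alpha) ** 3)      # numerical variance vs. closed form
```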
Let $\phi(s)$ be the probability generating function of the Borel-Tanner distribution. Then
$$\alpha^{r}\phi(s) = \sum (s\beta)^{x} A(x, r) \qquad (0 < s < 1),$$
and substituting $s\beta$ for $\beta$ in (8), we find
$$\phi(s) = [\alpha(s\beta)/\alpha(\beta)]^{r}, \tag{12}$$
where $\alpha(\beta)$ denotes the functional dependence defined by (3). Using (3), we can write
$$\phi(s) = s^{r} \exp[r\alpha(s\beta) - r\alpha(\beta)]. \tag{13}$$
Since $\phi(1) = 1$, $\phi'(1) = \mu$, $\phi''(1) = \sigma^{2} + \mu^{2} - \mu$, (10) and (11) can be confirmed by
differentiation of (13).

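The function $\alpha(\beta)$ can be written explicitly by use of Lagrange's Theorem (cf. Whittaker
& Watson, 1935):
$$\alpha = \sum_{n=1}^{\infty} \frac{n^{n-1}}{n!}\, \beta^{n}. \tag{14}$$
Then $\phi(s)$ can be written in terms of $s$ and the parameters $r$ and $\alpha$:
$$\phi(s) = s^{r} \exp\left[ r \sum_{n=1}^{\infty} \frac{n^{n-1} \alpha^{n} e^{-n\alpha}}{n!}\, (s^{n} - 1) \right]. \tag{15}$$
Differentiating this with respect to $s$ and using (10) when $s = 1$, we obtain
$$\sum_{n=1}^{\infty} \frac{n^{n} \alpha^{n} e^{-n\alpha}}{n!} = \frac{\alpha}{1-\alpha}. \tag{16}$$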
The relationship between $s$ and $\phi$ is fairly complicated in both (12) and (15). Two simpler
expressions are
$$\log\phi - \alpha r \phi^{1/r} - r\log s + \alpha r = 0 \tag{17}$$
and, using subscripts to denote partial differentiation,
$$\alpha\phi_{\alpha} - s\phi_{s} + \alpha s\phi_{s} + r\phi = 0. \tag{18}$$
If $K(s)$ denotes the cumulant generating function and $M(s)$ the moment generating function,
then (17) is equivalent to
$$K(s) = \alpha r [M(s)]^{1/r} + rs - \alpha r \tag{19}$$
and higher moments can be obtained fairly easily by differentiation of this formula.


Turning now to the case $\alpha > 1$, we will show that $\sum p(x; r, \alpha) < 1$, and thereby give a
method for computing $p(\infty; r, \alpha) = 1 - \sum p(x; r, \alpha)$. Let $\alpha'$ denote the smaller root of (3),
which is given by equation (14). Then
$$\sum p(x; r, \alpha) = \sum A(x, r)\, \beta^{x-r} e^{-\alpha r} = e^{(\alpha' - \alpha) r} \tag{20}$$
and therefore
$$p(\infty; r, \alpha) = 1 - e^{(\alpha' - \alpha) r} = 1 - (\alpha'/\alpha)^{r}, \tag{21}$$
where $\beta$ is computed from (3) and then $\alpha'$ from (14). In the intermediate case $\alpha = 1$, we have
$\alpha' = \alpha$, and therefore the infinite component is zero.
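
The computation of the infinite component can be sketched as follows. Instead of summing the series (14), this example finds the smaller root $\alpha'$ of $\alpha' e^{-\alpha'} = \alpha e^{-\alpha}$ by fixed-point iteration, which is a substituted technique and not the authors' method; the function names, tolerance and iteration limit are assumptions.

```python
import math

def smaller_root(alpha, tol=1e-14, max_iter=100_000):
    """Smaller root a' of a' * exp(-a') = alpha * exp(-alpha), for alpha > 1.

    Iterates a <- beta * exp(a) starting from a = beta, which lies below the root.
    """
    beta = alpha * math.exp(-alpha)
    a = beta
    for _ in range(max_iter):
        a_next = beta * math.exp(a)
        if abs(a_next - a) < tol:
            return a_next
        a = a_next
    return a

def prob_infinite(r, alpha):
    """p(infinity; r, alpha) = 1 - (alpha' / alpha)**r; zero when alpha <= 1."""
    if alpha <= 1.0:
        return 0.0
    return 1.0 - (smaller_root(alpha) / alpha) ** r

print(prob_infinite(1, 2.0))
```

The iteration converges because the derivative of $\beta e^{a}$ at the smaller root equals $\alpha'$, which is less than one when $\alpha > 1$; convergence slows as $\alpha$ approaches one.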

Finally, we mention two additional properties of the Borel-Tanner distribution.
(a) If $r$ is fixed, maximum likelihood estimation of $\alpha$ is equivalent to the method of moments,
i.e.
$$\hat{\alpha} = \frac{\bar{x} - r}{\bar{x}}. \tag{22}$$


(b) Since the Borel-Tanner distribution has a very long tail, the modal value $M$ is of some
interest. If $M$ is large enough to justify the use of Stirling's approximation, and $\alpha < 1$, then


f(Be)-1< M< f (be), (23) 
») = @onet 2 
where f(x) = (eit (24) 


f(x) is increasing from x = r to x = $r?+3r, at which value f > 1. However, f(M) < 1, and 
therefore the values of f(x) are only needed from x = r+1 to the first argument yielding 
f(x) > 1 for determination of M. 

When $r = 1$, $M = 1$, and when $\alpha > 1$, the greatest finite ordinate is $r$.
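
Both properties can be illustrated numerically. The sketch below estimates $\alpha$ from a sample through the moment relation $\mu = r/(1-\alpha)$, and locates the mode by direct search over the probabilities rather than through the Stirling bounds quoted above. The function names `alpha_hat` and `mode`, the sample values and the search limit `x_max` are illustrative assumptions.

```python
import math

def log_pmf(x, r, alpha):
    """log p(x; r, alpha) for the Borel-Tanner distribution."""
    return (math.log(r) + (x - r - 1) * math.log(x) - math.lgamma(x - r + 1)
            - alpha * x + (x - r) * math.log(alpha))

def alpha_hat(sample, r):
    """Moment (and maximum likelihood) estimate: (mean - r) / mean."""
    xbar = sum(sample) / len(sample)
    return (xbar - r) / xbar

def mode(r, alpha, x_max=2000):
    """Brute-force mode: the x in [r, x_max) with the largest probability."""
    return max(range(r, x_max), key=lambda x: log_pmf(x, r, alpha))

print(alpha_hat([1, 2, 3, 2], r=1))   # hypothetical sample with r = 1
print(mode(1, 0.5), mode(5, 0.9))
```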


2. TABLES
In the table which follows, we give values of the Borel-Tanner distribution, correct to
five places of decimals, in cumulative form, i.e.
$$P(x; r, \alpha) = \sum_{y=r}^{x} p(y; r, \alpha),$$
for the parameter values $r = 1$, $\alpha = 0.01\,(0.01)\,0.62$. In every case, the values of $x$ taken
are large enough to ensure that $P(x; r, \alpha) > 0.999$.
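
A table of this cumulative form can be regenerated with a few lines of code. The sketch below accumulates $p(y; r, \alpha)$ from $y = r$ until the running total exceeds 0.999 and rounds to five decimals; the function names and the `threshold` parameter are assumptions, and the loop is intended only for $\alpha < 1$, where the total reaches the threshold.

```python
import math

def pmf(x, r, alpha):
    """Borel-Tanner probability p(x; r, alpha), evaluated in log space."""
    log_a = math.log(r) + (x - r - 1) * math.log(x) - math.lgamma(x - r + 1)
    return math.exp(log_a - alpha * x + (x - r) * math.log(alpha))

def cumulative_table(r, alpha, threshold=0.999):
    """Rows (x, P(x; r, alpha)) for x = r, r+1, ... until P exceeds the threshold."""
    rows, total, x = [], 0.0, r
    while total <= threshold:
        total += pmf(x, r, alpha)
        rows.append((x, round(total, 5)))
        x += 1
    return rows

for x, p in cumulative_table(1, 0.10):
    print(x, p)
```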

Certain additional tables, some of which were required by the authors in a special
investigation, have been computed, but owing to their incomplete character they are not
reproduced here. They are summarized under headings I-V.

I. Values of $P(x; r, \alpha)$ for $r = 1$, $\alpha = 0.63\,(0.01)\,0.99$, with the $x$ values arbitrarily
stopped at $x = 34$. In these cases, the cumulative sums fall short of 0.999; typical final
values are

α = 0.7, P = 0.99562;  α = 0.8, P = 0.97911;
α = 0.9, P = 0.93711;  α = 0.99, P = 0.87280.
II. Values of $p(x; r, \alpha)$, correct to five decimal places, with


a \ f a r 
0-1 0-5 1 (1) 20 0-9 1(1)8 
0-2 0-6 1 (1) 24 0-99 | 
0-3 _ 0-7 1(1) 19 0-999 ich 
0-4 0:8 1 (1) 10 





In each of these cases, the value of $x$ is taken up to the point where $p(x; r, \alpha) < 0.01$, but
never beyond $x = 51$. Owing to the very long tail of the Borel-Tanner distribution, this
frequently means that $P(x; r, \alpha)$ is substantially less than unity.

III. Values of $P(x; r, \alpha)$, correct to four decimal places, for $r = 1$, $\alpha = 1.0\,(0.1)\,2\,(1)\,10$,
including the infinite value given by equation (21). In the majority of cases, values of $x$
are great enough to ensure that $P(x; r, \alpha) > 0.99$, but for $\alpha = 1, 1.1$, the sequence has been
truncated at $x = 73$.

IV. For calculation of the mode, tables of $f(x)$ from equation (24) for $r = 2\,(1)\,20$ with
$x$ going from $r$ to a value for which $f(x) > 1$.

V. Fragmentary tables of $p(x; r, \alpha)$ and $P(x; r, \alpha)$, without the infinite value, but with
$x$ going in many cases to very large values (> 100), and with the following parameter values:


α                        r
1.0, 1.3 (0.1) 1.9       2, 3, 4, 5
1.1, 1.2                 2 (1) 10
2                        2, 3, 4, 5
3                        2, 3, 4
4, 5                     2


We invite readers’ suggestions for further calculation or publication of Borel-Tanner 
tables, and in the meantime shall be glad to make our unpublished material available to 
anyone wishing to use any special results from among those listed above. 


REFERENCES 


BOREL, E. (1942). Sur l'emploi du théorème de Bernoulli pour faciliter le calcul d'une infinité de
coefficients. Application au problème de l'attente à un guichet. C.R. Acad. Sci., Paris, 214, 452-6.

KENDALL, D. G. (1957). Some problems in the theory of dams. J. Roy. Statist. Soc. B, 19, 207-33.

TANNER, J. C. (1953). A problem of interference between two queues. Biometrika, 40, 58-69.

WHITTAKER, E. T. & WATSON, G. N. (1935). A Course of Modern Analysis (4th edition). Cambridge
University Press.




x    P(x; 1, α)

α = 0.01
1    0.99005
2    0.99985

α = 0.02
1    0.98020
2    0.99941

α = 0.03
1    0.97045
2    0.99870
3    0.99993

α = 0.04
1    0.96079
2    0.99771
3    0.99984

α = 0.05
1    0.95123
2    0.99647
3    0.99970

α = 0.06
1    0.94176
2    0.99498
3    0.99949

α = 0.07
1    0.93239
2    0.99325
3    0.99921

α = 0.08
1    0.92312
2    0.99129
3    0.99884
4    0.99983

α = 0.09
1    0.91393
2    0.98911
3    0.99838
4    0.99974

α = 0.10
1    0.90484
2    0.98671
3    0.99782
4    0.99961
Table 1. The Borel-Tanner distribution (r = 1)


x    P(x; 1, α)

α = 0.11
1    0.89583
2    0.98411
3    0.99716
4    0.99945

α = 0.12
1    0.88692
2    0.98132
3    0.99639
4    0.99924

α = 0.13
1    0.87810
2    0.97833
3    0.99550
4    0.99898
5    0.99976

α = 0.14
1    0.86936
2    0.97517
3    0.99449
4    0.99866
5    0.99966

α = 0.15
1    0.86071
2    0.97183
3    0.99335
4    0.99829
5    0.99954

α = 0.16
1    0.85214
2    0.96833
3    0.99209
4    0.99785
5    0.99938

α = 0.17
1    0.84366
2    0.96467
3    0.99070
4    0.99733
5    0.99919

α = 0.18
1    0.83527
2    0.96085
3    0.98917
4    0.99674
5    0.99897
6    0.99966


a= 0-19 


0-82696 
-95689 
‘98752 
*99607 
-99869 


0-99955 


a= 0-20 


0-81873 
*95279 
*98572 
*99531 
*99837 


0-99942 


a= 0-21 


0-81058 
*94856 
*98379 
*99446 
*99800 


0-99925 


a= 0-22 
0-80252 
-94421 
‘98173 
-99351 
-99757 


0-99906 


a= 0-23 


0-79453 
‘93973 
‘97953 
*99246 
*99707 


0-:99882 
-99951 
a= 0-24 


0-78663 
“93514 


x 


P(x; 1, a) 


a=0-24 (cont.) 


JId®d Ar wd SIO Tm & 


ado OF Wb 


mw 


0-97719 
*99131 
-99651 


0-99855 
-99938 


a= 0°25 


0-77880 
*93043 
*97472 
*99005 
*99588 


0-99823 
*99922 


a = 0-26 


0-77105 
*92563 
‘97211 
-98868 
*99516 


0-99786 
-99903 


a=0:-27 


0-76338 
*92072 
*96937 
-98719 
*99437 


0-99743 
-99880 
*99943 


a= 0°28 


0-75578 
“91572 
-96649 
“98559 
-99349 


0-99695 
“99853 
-99928 


a= 0-29 


0-74826 
-91063 
*96348 
‘98387 


x 
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P(x; 1, a) 


a =0-29 (cont.) 


oro o 


or Wh = 


omar on 


CamaHsto arwnd = 


oCnmrrna arkwhd = 


0-99251 


0-99640 
-99823 
*99911 


a= 0-30 


0-74082 
*90546 
-96035 
-98203 
-99145 


0-99579 
*99787 
-99890 
*99942 


a=0-31 


0-73345 
-90021 
*95708 
*98007 
-99028 


0-99510 
*99746 
-99866 
-99928 


a = 0-32 


0-72615 
*89488 
-95369 
*97799 
-98902 


0-99433 
-99700 
-99838 
-99911 


a= 0-33 


0-71892 
*88948 
*95018 
*97578 
-98764 


0-99348 
-99647 
*99805 
‘99891 
-99938 
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P(x; 1, a) 
a = 0°34 
0-71177 
*88402 
*94655 


*97345 
-98616 


0-99254 
*99588 
-99768 
-99867 
*99923 


a= 0°35 


0-70469 
*87849 
*94279 
-97099 
*98457 


0-99152 
-99522 
*99725 
-99840 
-99905 


a = 0°36 


0-69768 
*87291 
*93892 
‘96840 
*98286 


0-99039 
°99448 
-99677 
-99808 
*99885 


0-99930 


a = 0°37 


0-69073 
*86727 
*93494 
-96569 
*98104 


0-98917 
*99366 
*99622 
-99771 
“99860 


0-99913 
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x 


P(x; 1, a) 
a= 0°38 


0-68386 
*86157 
“93085 
*96285 
-97909 


0:98785 
-99276 
‘99561 
-99730 
*99832 


0-99894 
-99933 


a= 0-39 


0-67706 
*85584 
-92665 
*95989 
‘97703 


*98641 
‘99177 
*99492 
*99682 
“99799 


0-99871 
*99917 


a= 0-40 


0-67032 
*85005 
*92234 
*95680 
‘97484 


0-98487 
-9906¢ 
-99416 
-99628 
‘99761 


0-99844 
-99898 
-99933 


a=0-41 


0-66365 
*84423 
-91793 
*95358 
*97253 


Table 1 (cont.) 
x P‘xz; 1, a) 
a=0-41 (cont.) 

6 0-98322 
7 


-98950 
8 -99332 
9 -99568 
10 -99718 
11 0-99814 
12 -99876 
13 *99917 
a= 0-42 
1 0-65705 
2 *83837 
3 -91342 
*95024 
5 -97009 
6 0-98144 
7 -98822 
8 -99238 
9 -99500 
10 -99669 
11 0-99778 
12 -99850 
13 -99898 
14 -99930 
a= 0-43 
1 0-65051 
2 *83247 
3 -90882 
4 -94678 
5 *96752 
6  -097955 
7 -98683 
8 -99136 
9 -99425 
10 -99613 
11 0-99737 
12 *99820 
13 -99876 
14 -99914 
a= 0-44 
1 064404 
2 *82654 
3 -90412 
+ -94320 
5 -96483 
6 0-97754 
7 -98532 


x P(x; 1, a) 
a=0-44 (cont.) 


8 0-99024 
9 -99342 
10 “99551 
11 0-99690 
12 *99785 
13 -99850 
14 -99894 
15 -99925 
a = 0-45 
1 0-63763 
2 *82058 
3 -89933 
4 -93950 
5 -96201 
6 0-97540 
7 -98371 
8 -98902 
9 -99249 
10 -99481 
ll 0-99638 
12 -99745 
13 -99819 
14 -99871 
15 -99908 
a= 0°46 
1 0-63128 
2 -81460 
3 *89445 
4 -93568 
5 -95906 
6 097314 
a -98197 
8 -98769 
9 “99148 
10 -99403 
11 0:99578 
12 -99699 
13 -99784 
14 -99844 
15 -99887 
16 0-99918 
a=0-47 
1 0-62500 
2 -80860 
3 -88950 
4 -93174 
5 *95598 


x 


P(a; 1, a) 


a=0-47 (cont.) 


CHAD ah wdoe 


10 


0-97074 
-98012 
*98625 
-99036 
-99317 


0-99511 
-99647 
-99744 
-99813 
*99862 


0-99899 
*99925 


a= 0-48 


0-61878 
*80257 
*88445 
-92769 
*95277 

0-96822 
‘97814 
*98470 
“98915 
*99221 


0:99436 
-99588 
-99698 
-99776 
*99834 


0-99876 
-99907 


a = 0-49 


0-61263 
*79653 
*87934 
*92353 
*94944 


0-96557 
-97603 
*98303 
*98782 
-99116 


0-99353 
*99522 
-99645 
*99735 
‘99801 


Table 1 (cont.) 
x) x  P(x;1,¢) x P(x; 1, «) x P(x; 1, a) x P(x; 1, a) x P(x;1,a) 
) a=0-49 (cont.) a =0-52 (cont.) a=0-54 (cont.) a=0-56 (cont.) a= 0-58 
4 16  0-99850 3 0-86355 6  0-95030 6  0-94326 1  0-55990 
2 17 -99886 4 -91039 7 96351 7 95754 2 74172 
5 18 -99913 5 -93868 s 97277 8 -96772 3 83029 
6 9 -97942 9 97515 4 88142 
6 095681 
7 a = 0-50 : ppsscees 10 -98428 10 -98067 5 91385 
1 | 1  0-60653 8 97727 1l 0-98788 11 —-0-98483 6 0-93569 
7 2 79047 9 -98315 12 -99059 12 -98801 7 -95102 
4 3 87414 10 -98738 13 -99265 13 99047 8 -96211 
3 4 -91926 14 99422 14 -99237 9 -97032 
2 5 -94598 = ” aa 15 99544 15 99387 10 97652 
9 6 0-96278 13 99443 16 —0-99638 16 —-0-99505 1l  0-98126 
5 7 -97379 14 -99570 17 99712 17 -99599 12 98494 
8 -98124 15 -99667 18 99770 18 99674 13 -98782 
9 -98638 19 -99816 19 99734 14 -99010 
10 -99001 ” ee 20 -99852 20 -99783 15 -99191 
17 “99797 
18 11 0-99260 18 -99841 21 0-99880 21 099822 16 0-99337 
7 ! 12 -99448 19 99875 22 -99903 22 99854 17 99454 
au 13 “99585 20 -99901 23 -99879 18 99549 
9 14 -99687 oti 24 -99901 19 99626 
17 15 -99762 ome 20 -99689 
22 | 16 0-99819 a = 0-53 : rine 21 0-99741 
14 17 -99861 1 058860 3 ; ast7 22 -99784 
79 18 -99893 2 77223 4 .89633 a= 0°57 23 99819 
15 19 -99918 3 *85815 5 92680 7 24 -99848 
21 4 -90580 1 0-56553 25 -99873 
36 a=0-51 5 -93484 6 0-94685 2 -74782 -. ool 
88 1 —0-60050 6 095362 a . — . ae 
: ~ ~ 8 -97032 4 -88648 
98 j 2 “18440 7 96629 9 poinicen 5 pepoes 
76 3 -86888 8 -97509 10 .98254 
34 4 -91488 9 98135 6 0-93954 a = 0-59 
a6 5 -94239 10 -98589 Ml 0-98642 7 “95435 — 
07 6  0-95986 11 —-0-98923 - — “i — 2 73562 
13 -99161 9 -97281 : 
7 -97143 12 -99172 " posced Pes Ppermnond 3 -82456 
s -97932 13 -99359 o. anaes 4 -87627 
) i 9 -98483 14 -99501 1l 0-98312 5 -90931 
63 10 -98875 15 -99609 16 0-99576 12 pein 6  0-93171 
553 11 0-99158 «= 6S 99693 sae a pec 7 = -94754 
134 129936517 oo758 «| 8 oe = pan : 8  -95908 
353 13 -99518 18 -99808 ” ech sf _— 9 -96769 
144 14 .99632 19 998482? "99820 16 0-99426 10 -97423 
557 15 -99718 20 -99879 21 0-99854 17 99531 oo 
303 ‘ifn «sc aon = S&S 6 6SltCc RC Ce 
23 -99903 19 -99684 
303 i7 -99832 20 99740 13 -98632 
182 18 -99869 14 -98879 
116 19 -99898 a = 0-54 a = 0°56 21 0:99785 15 -99077 
353 7 1 0-58275 1 os7izn OO 6099287 
522 — 2 76613 2 -75393 4 99877 17 “99367 
645 nin 3 -85269 3 -84160 ; . 18 -99473 
735 1 059452 4 -90112 4 -89145 af “00807 19 -99560 
801 2 -77832 5 -93088 5 -92260 26 = 0-99914 20 -99632 
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x P(x; 1, a) 
a=0-59 (cont.) 


21 0-99691 
22 -99740 
23 ‘99781 
24 *99815 
25 -99843 
26 0:99867 
27 -99887 
28 -99904 
a= 0-60 
1 054881 
2 *72953 
3 *81879 
4 *87104 
5 *90465 
6 0-92760 
7 -94393 
8 *95591 
9 -96491 
10 ‘97179 
11 0:97714 
12 -98134 
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x 


P(a; 1, a) 


a= 0-60 (cont.) 


1 
2 


0-98468 
*98735 
*98951 


0-99126 
-99270 
-99388 
*99485 
-99566 


0-99633 
-99689 
-99736 
*99775 
-99809 


0-99837 
-99860 
-99880 
*99897 
-99912 


a=0-61 


0-54335 
*72344 


Table 1 (cont.) 


F P(a; 1, a) 
a= 0-61 (cont.) 


3 0-81298 
4 *86573 
5 -89989 
6 0-92336 
7 -94017 
8 *95259 
9 -96198 
10 *96921 
11 0-97485 
12 *97932 
13 -98290 
14 *98578 
15 *98812 
16 0-99004 
17 -99161 
18 *99292 
19 -99400 
20 -99491 
21 0-99566 
22 -99630 
23 -99684 
24 -99729 


«x 


P(x; 1, a) 


a=0-61 (cont.) 


25 


coe mstnon arwds = 


— 


— 
noe 


0-99767 


0-99800 
-99828 
*99852 
*99872 
-99889 


0-99904 


a= 0-62 


0-53794 
*71736 
*80712 
*86035 
*89502 


0-91899 
*93629 
*94912 
*95889 
-96647 


0-97242 
-97716 


x P(x; 1, a) 
a= 0-62 (cont.) 


13 0-98097 
14 *98409 
15 -98660 
16 0-98868 
17 *99041 
18 *99185 
19 -99305 
20 *99405 
21 0-99490 
22 *99562 
23 *99623 
24 *99675 
25 -99719 
26 0-99757 
27 -99790 
28 *99817 
29 -99841 
30 *99862 
31 0-99880 
32 “99895 
33 -99908 
The distribution of Kendall’s score S for a pair of tied rankings


By E. J. BURR 
University of New England, Armidale, N.S.W., Australia 


1. INTRODUCTION AND SUMMARY 


In a scatter diagram of a bivariate population (or sample), if both variates are measured 
on an interval scale, we say that there is linear regression or curvilinear regression if the 
means of arrays tend to lie close to a straight line or curve, respectively. But if either or 
both variates are measured only on an ordinal or ranking scale, in which the ratios of the 
intervals between consecutive ranks are undefined or unknown, then we can no longer 
distinguish between a (straight) line of regression and a monotonic curve of regression. 
(A rising monotonic curve is one for which $x_1 > x_2$ implies $y_1 > y_2$, where $(x_1, y_1)$ and $(x_2, y_2)$
are the co-ordinates of any two points on the curve. For a falling monotonic curve, $x_1 > x_2$
implies $y_1 < y_2$.) In this case the terms linear correlation and linear regression are
inappropriate. We propose to describe this type of association between rankings by the term
monotonic correlation, to distinguish it from cases where the regression curve contains both
rising and falling arcs.

Given a random sample of n members ranked according to two different properties U, V, 
it is often required to test the null hypothesis that there is no association between U, V 
in the parent population, against the alternative hypothesis that there is some degree of 
monotonic correlation. The most convenient test statistic for this purpose is Kendall’s 
score (or rank sum) S. 

Methods for finding the sampling distributions of S for n = 2, 3, 4, ... have been described 
by Kendall (1955) for the case when both rankings are free from ties, and by Sillitto (1947) 
for the case when one ranking is free from ties. But in the general case when both rankings 
contain ties, the only clues to the distribution of S up to the present time have been a know- 
ledge of its mean and variance and a conjecture that ‘there is probably little important 
error involved in using the normal approximation for n > 10, unless the ties are very 
extensive or very numerous’ (Kendall, 1955, § 4-10). 

In this paper it is shown how the exact distribution of S for a sample of n members, with
any specified permutations of tied groups in each ranking, may be expressed as a linear
combination of certain distributions for samples of fewer than n members (§6). A table is
given showing the 241 essentially distinct distributions of S for samples with $2 \le n \le 6$.
Instructions and worked examples are supplied (§7) showing how to use this table (a) to
test the significance of an observed score when $n \le 6$, and (b) to compute sampling
distributions when $n > 6$. Calculations of the second kind are sometimes tedious, but it is
shown how they may often be avoided, either by using an independent short method for
finding the exact significance levels of the two or three extreme values of S and of −S
(§8), or by using the normal approximation (§9). Determination of the continuity
correction for use with the normal approximation is discussed, and examples with n = 8, 10 and
12 are given showing the magnitudes of the errors involved in the normal approximation.
In the cases examined, Kendall’s conjecture is found to be correct except when the ties
are so extensive that only a few of the extreme values of S are significant, in which case the
normal approximation may be avoided by using the method of §8.


2. THE RANKINGS 


Let n members be ranked or classified according to the degree to which they possess a
property U. Suppose for the sake of definiteness that, in the case in question, only three
different degrees of U are distinguishable, so that the ranking separates the n members
into three classes, having $u_1$ members in the first class, $u_2$ in the second and $u_3$ in the third,
with $u_1 + u_2 + u_3 = n$. For brevity we write
$$\{u\} = (u_1, u_2, u_3),$$
and we call the terms $u_i$ of {u} the class frequencies of the first ranking.

If every class frequency is unity, so that there are n classes, we say that the ranking is 
untied; otherwise, two or more members in the same class (that is, ranked equal) are said 
to be tied. A ranking containing only two classes is called a dichotomy. In a ranking with 
three or more classes, the classes must have a natural order; in other words, the level of 
measurement must be ordinal and not merely nominal (Siegel, 1956). 

Let the same n members be ranked again according to a second property V, and suppose
that in this case four different degrees of V are distinguishable. We write
$$\{v\} = (v_1, v_2, v_3, v_4),$$
where the terms $v_j$ of {v} are the class frequencies of the second ranking, and $\sum v_j = n$.


Table 1. Ordered contingency or frequency table showing two rankings of n members

a_{11}  a_{12}  a_{13}  a_{14} | u_1
a_{21}  a_{22}  a_{23}  a_{24} | u_2    First ranking
a_{31}  a_{32}  a_{33}  a_{34} | u_3
v_1     v_2     v_3     v_4

Second ranking


The result of the two rankings may be conveniently displayed in the form of an ordered
contingency or frequency table (Table 1) with row totals {u} and column totals {v}. We call
Table 1 a 3 × 4 table of n members. It may be thought of as a scatter diagram for a bivariate
sample. For brevity we write {a} for the array of twelve cell frequencies or elements $a_{ij}$.
Some numerical examples of such arrays are given in §8.

In the sequel we shall often make statements about a 3 × 4 table, but it is to be
understood that the numbers 3, 4 have no special significance, and are chosen only for convenience
of exposition. Such statements will be generally true (with obvious modifications) for an
$l \times m$ frequency table of $l$ rows and $m$ columns, where $2 \le l \le n$, $2 \le m \le n$.


3. KENDALL’S SCORE S 


For a pair of rankings of n members, Kendall (1955) defined the score associated with
any two of the n members as +1 if they are ranked in the same order by the two rankings,
−1 if they are ranked in opposite order, and 0 if they are tied in either or both rankings.
The total score S for the pair of rankings is the algebraic sum of the $\tfrac{1}{2}n(n-1)$ contributions
obtained from all such pairs of members.
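
This pairwise definition translates directly into code. The sketch below is a minimal illustration in which each ranking is supplied as a list of class labels, one per member, with larger labels meaning higher rank; the function name `kendall_score` and the example data are assumptions.

```python
from itertools import combinations

def kendall_score(u, v):
    """Kendall's score S: sum over all pairs of +1 (same order), -1 (opposite), 0 (tied)."""
    s = 0
    for i, j in combinations(range(len(u)), 2):
        du = (u[i] > u[j]) - (u[i] < u[j])   # sign of the difference in the first ranking
        dv = (v[i] > v[j]) - (v[i] < v[j])   # sign of the difference in the second ranking
        s += du * dv                         # zero whenever the pair is tied in either ranking
    return s

print(kendall_score([1, 1, 2], [1, 2, 2]))   # three members with ties in both rankings
```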
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For the pair of rankings shown in Table 1, the score S is the algebraic sum of all second- 
order determinants in {a}, thus 


S = (Gy Ago — yy Aq) + (441 Gyg — Ay Ag) +... + (yg %gq — A gg 494). (3-1) 


For example, there are a,, a2, ways of selecting an untied pair of members from the two cells 
(row 1, column 1) and (row 2, column 3), and each such pair contributes + 1 to S. Similarly, 
each of the a,,@,,; untied pairs which can be selected from (row 2, column 1) and (row 1, 
column 3) contributes — 1 to S. 

In practice, the sum (3-1) may be found rapidly by grouping either the scores associated
with each element $a_{ij}$ and elements in lines below it, or those associated with each element
$a_{ij}$ and elements in columns to the right of it (Kendall, 1955, § 3.16).

We observe that a row or column whose elements are all zero may be inserted in or 
deleted from Table 1 without affecting S. 
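
The sum (3-1) can also be evaluated mechanically. The following is a minimal sketch in Python (not part of the original paper); the function name `kendall_score` is an assumption introduced here for illustration. It scores each cell against the cells in lower rows, counting those to the right as concordant and those to the left as discordant, which is equivalent to summing all second-order determinants.

```python
def kendall_score(table):
    """Kendall's score S for an ordered contingency table given as a list of rows.

    Hypothetical helper (not from the paper): each member in cell (i, j) scores +1
    against every member in a lower row and a later column, and -1 against every
    member in a lower row and an earlier column; tied pairs score 0.
    """
    score = 0
    for i, row in enumerate(table):
        for j, a_ij in enumerate(row):
            for lower in table[i + 1:]:
                concordant = sum(lower[j + 1:])   # below and to the right
                discordant = sum(lower[:j])       # below and to the left
                score += a_ij * (concordant - discordant)
    return score


# Array of maximum score for {u} = (3, 1, 3), {v} = (2, 1, 3, 1), as in example 8-1.
best = [[2, 1, 0, 0],
        [0, 0, 1, 0],
        [0, 0, 2, 1]]
print(kendall_score(best))
```

Inserting a row of zeros into `best` leaves the returned score unchanged, in line with the observation above.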


4. PROBABILITIES IN RANDOM SAMPLING


We now regard Table 1 as the result of drawing a random sample (subject to restrictions
on the marginal totals {u}, {v}) of n members from a large parent population. We suppose
for the moment that each member of this population, if drawn, would belong in a definite 
cell in Table 1. If this is not true of the whole population then we confine our attention to 
the subpopulation for which it is true. (This restriction will be removed later.) 

In practice, the investigator may or may not choose one or both of the sets of marginal 
totals before drawing the sample. When we wish to find the sampling distribution of S for 
Table 1, the question arises whether the terms in {u}, {v} should be regarded as fixed in size 
and order for all possible samples, as though the investigator had chosen them in advance, 
or whether they should be subject to random fluctuations from one sample to another. 

In the case of 2 x 2 tables this question has given rise to some controversy (Fisher, 1956), 
but it seems to be generally agreed that it is permissible, though perhaps not ideal, to regard 
both {u} and {v} as fixed. In this paper we regard {u} and {v} as fixed during sampling for
tables of any size, mainly because of the enormous increase in complexity which results if 
we try to allow for fluctuations. 

We now set up the null hypothesis $H_0$ that the properties U, V are independent in the
population. This means that the probability that a member of the population will belong
in a specified column is not affected by foreknowledge of the row in which it lies.

Let $\alpha_i$ be the probability that a member of the population will belong in row i, and $\beta_j$ the
probability that it will belong in column j. Then if $H_0$ is true, the probability that the member
will belong in the cell in row i, column j is $\alpha_i\beta_j$. We recall that the population has been so
restricted that every sample can be displayed in a table having at most 3 rows and at most
4 columns. In a random sample of n members with unrestricted marginal totals (save for the
restriction in the preceding sentence), the probability of obtaining the particular cell
frequencies shown in Table 1 is given by the multinomial distribution

$$\frac{n!}{a_{11}!\,a_{12}! \dots a_{34}!}\,(\alpha_1\beta_1)^{a_{11}}(\alpha_1\beta_2)^{a_{12}} \dots (\alpha_3\beta_4)^{a_{34}},$$

which readily simplifies to

$$\frac{n!}{a_{11}!\,a_{12}! \dots a_{34}!}\,\alpha_1^{u_1}\alpha_2^{u_2}\alpha_3^{u_3}\,\beta_1^{v_1}\beta_2^{v_2}\beta_3^{v_3}\beta_4^{v_4}. \tag{4-1}$$

Now in sampling with unrestricted row totals, the probability of obtaining the particular
row totals $u_1, u_2, u_3$ in that order is

$$\frac{n!}{u_1!\,u_2!\,u_3!}\,\alpha_1^{u_1}\alpha_2^{u_2}\alpha_3^{u_3}. \tag{4-2}$$


Hence on dividing (4-1) by (4-2) we find that the probability of obtaining the cell frequencies
in Table 1 in sampling with {u} fixed is

$$\frac{u_1!\,u_2!\,u_3!}{a_{11}!\,a_{12}! \dots a_{34}!}\,\beta_1^{v_1}\beta_2^{v_2}\beta_3^{v_3}\beta_4^{v_4}. \tag{4-3}$$


Again, in sampling with unrestricted column totals, the probability of obtaining the
particular column totals $v_1, v_2, v_3, v_4$ in that order is

$$\frac{n!}{v_1!\,v_2!\,v_3!\,v_4!}\,\beta_1^{v_1}\beta_2^{v_2}\beta_3^{v_3}\beta_4^{v_4}, \tag{4-4}$$


and because of $H_0$ this is true whether the row totals have been restricted or not. Hence on
dividing (4-3) by (4-4) we find that the probability of obtaining the cell frequencies in
Table 1 in sampling with both {u} and {v} fixed is

$$\frac{u_1!\,u_2!\,u_3!\,v_1!\,v_2!\,v_3!\,v_4!}{n!\,a_{11}!\,a_{12}! \dots a_{34}!}. \tag{4-5}$$
The probability of obtaining the score S in such a sample is therefore

$$p(S; \{u\}, \{v\}) = \sum \frac{u_1!\,u_2!\,u_3!\,v_1!\,v_2!\,v_3!\,v_4!}{n!\,a_{11}!\,a_{12}! \dots a_{34}!}, \tag{4-6}$$

where the summation extends over all permissible arrays {a} which give rise to the score S.
Now if we multiply both sides of (4-6) by a suitable constant, we may replace the
probabilities p by relative frequencies P which are all integers. Two obvious ways of
choosing this constant yield relative frequencies

$$P_1(S; \{u\}, \{v\}) = \frac{n!}{v_1!\,v_2!\,v_3!\,v_4!}\,p(S; \{u\}, \{v\}) = \sum \frac{u_1!\,u_2!\,u_3!}{a_{11}!\,a_{12}! \dots a_{34}!}, \tag{4-7}$$

$$P_2(S; \{u\}, \{v\}) = \frac{n!}{u_1!\,u_2!\,u_3!}\,p(S; \{u\}, \{v\}) = \sum \frac{v_1!\,v_2!\,v_3!\,v_4!}{a_{11}!\,a_{12}! \dots a_{34}!}. \tag{4-8}$$

We note that

$$\sum P_1 = n!/v_1!\,v_2!\,v_3!\,v_4!, \qquad \sum P_2 = n!/u_1!\,u_2!\,u_3!. \tag{4-9}$$


If the smallest such constant is chosen, so that the relative frequencies are integers having 
no common factor, then we shall denote these relative frequencies by P(S; {u}, {v}). 
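
As an illustration of (4-5), (4-7) and (4-8), the following Python sketch (not part of the original paper) evaluates the probability of a single array with both sets of marginal totals fixed, together with its contributions to $P_1$ and $P_2$. The function names are assumptions introduced here; exact rational arithmetic is used so that no rounding occurs.

```python
from fractions import Fraction
from math import factorial, prod


def margins(table):
    """Row totals {u} and column totals {v} of a table given as a list of rows."""
    u = [sum(row) for row in table]
    v = [sum(col) for col in zip(*table)]
    return u, v


def table_probability(table):
    """Probability (4-5) of one array when both {u} and {v} are fixed."""
    u, v = margins(table)
    n = sum(u)
    numerator = prod(factorial(x) for x in u) * prod(factorial(x) for x in v)
    denominator = factorial(n) * prod(factorial(a) for row in table for a in row)
    return Fraction(numerator, denominator)


def relative_frequency_terms(table):
    """Contribution of one array to the sums (4-7) and (4-8), as (P_1 term, P_2 term)."""
    u, v = margins(table)
    cells = prod(factorial(a) for row in table for a in row)
    p1_term = prod(factorial(x) for x in u) // cells
    p2_term = prod(factorial(x) for x in v) // cells
    return p1_term, p2_term


# Array of maximum score for {u} = (3, 1, 3), {v} = (2, 1, 3, 1), as in example 8-1.
best = [[2, 1, 0, 0],
        [0, 0, 1, 0],
        [0, 0, 2, 1]]
print(table_probability(best), relative_frequency_terms(best))
```

For this array the probability is 3/140, which agrees with the significance level quoted for the highest score in examples 7-5 and 8-1.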

We observe that the score S, the probabilities p and the relative frequencies $P_1$, $P_2$, P
are all unaffected by the insertion or removal of a row or column of zeros in Table 1, or any
number of such rows or columns. This observation enables us to remove the restriction
placed on the parent population at the beginning of this section. For suppose that the
members of the population can be ranked in a table of (for example) 5 rows and 7 columns.
Then Table 1 may be regarded as a 5 x 7 table whose row totals are one of the 10 sequences
$(u_1, u_2, u_3, 0, 0)$, $(u_1, u_2, 0, u_3, 0)$, ..., $(0, 0, u_1, u_2, u_3)$, and whose column totals are one of the
35 sequences $(v_1, v_2, v_3, v_4, 0, 0, 0)$, $(v_1, v_2, v_3, 0, v_4, 0, 0)$, ..., $(0, 0, 0, v_1, v_2, v_3, v_4)$. Thus the
set of all possible samples whose non-zero marginal totals have the fixed values {u}, {v} of
Table 1 falls into 350 subsets according to the distribution of the zero marginal totals among
the non-zero totals. If we confine our attention to the samples of a single preassigned subset,
the values of S, p, $P_1$, $P_2$, P are as given above. But these values are independent of the
choice of subset. Therefore the division into subsets is irrelevant, and our results are true 
for sampling in which the non-zero marginal totals are fixed, irrespective of the number or 
position of empty rows or columns. We remark that this reasoning is valid only if the null 
hypothesis is true for the whole population; it is not sufficient that the null hypothesis be 
true merely for those classes which are represented in one sample. 


5. THE SIGNIFICANCE OF S 


The exact probability distribution of S for random sampling with fixed {u}, {v} is given
by (4-6) when the null hypothesis is true. It has been shown by Kendall (1955) that the mean
value of S in this distribution is zero, that its variance is given by

$$\operatorname{var} S = \frac{1}{18}\Big[f(n) - \sum f(u_i) - \sum f(v_j)\Big] + \frac{1}{9g(n)}\Big[\sum g(u_i)\Big]\Big[\sum g(v_j)\Big] + \frac{1}{2h(n)}\Big[\sum h(u_i)\Big]\Big[\sum h(v_j)\Big], \tag{5-1}$$

where

$$f(x) = x(x-1)(2x+5), \qquad g(x) = x(x-1)(x-2), \qquad h(x) = x(x-1);$$


and that, if the number of tied elements in each ranking is small compared with n, the
distribution tends to normality as n increases indefinitely. It is also known that for 2 x 2
tables the distribution of S tends to normality when $u_1, u_2, v_1, v_2$ all tend to infinity. It seems
plausible that the distribution should tend to normality in the general case, except possibly
when one of the ratios $u_i/n$ and/or one of $v_j/n$ tends to unity; but no formal proof of this
property is known.
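
The variance formula (5-1) is easily mis-evaluated by hand, so a short Python sketch may be useful (it is not part of the original paper, and the function name `var_s` is an assumption introduced here). It takes the two sets of class frequencies and returns var S as an exact fraction; it assumes n is at least 3, so that g(n) is not zero.

```python
from fractions import Fraction


def var_s(u, v):
    """Variance of S from (5-1) for row totals u and column totals v (n >= 3)."""
    n = sum(u)
    if n != sum(v) or n < 3:
        raise ValueError("u and v must have the same sum n, with n >= 3")

    def f(x):
        return x * (x - 1) * (2 * x + 5)

    def g(x):
        return x * (x - 1) * (x - 2)

    def h(x):
        return x * (x - 1)

    first = Fraction(f(n) - sum(map(f, u)) - sum(map(f, v)), 18)
    second = Fraction(sum(map(g, u)) * sum(map(g, v)), 9 * g(n))
    third = Fraction(sum(map(h, u)) * sum(map(h, v)), 2 * h(n))
    return first + second + third


# Marginal totals of examples 7-6 and 7-5 respectively.
print(float(var_s((1, 3, 2, 2), (1, 2, 1, 4))))
print(float(var_s((3, 1, 3), (2, 1, 3, 1))))
```

With the marginal totals of example 7-6 the result multiplied by 840 is 43,090, and with those of example 7-5 the result multiplied by 140 is 4692, which are the check values quoted in those examples.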

Knowing the probability distribution of S, we can determine the level of significance of 
an observed score in the usual way. Thus if Table 1 yields a positive score $S_0$, we can compute
the probability of obtaining a score $S \ge S_0$ in random sampling with fixed {u}, {v}, and we
call this probability the (one-tailed) significance level of $S_0$. If this probability is small,
for example, less than 0.05 or 0.01, we may reject $H_0$ at the 5% level or the 1% level,
respectively, in favour of the alternative hypothesis that there is some degree of positive 
monotonic correlation between the two rankings, that is, that the members of the popula- 
tion tend to be ranked in approximately the same order by the two rankings. This procedure 
may be modified in the usual ways to deal with negative values of S, or with a two- 
tailed test. 

This test must be distinguished from a form of the chi-square test in which the test statistic is

$$X^2 = \sum (a_{ij} - A_{ij})^2 / A_{ij},$$

where $A_{ij} = u_i v_j/n$. For a table of l rows and m columns, it is known that, when n is large
and the null hypothesis is true, $X^2$ conforms approximately to the chi-square distribution
with (l - 1)(m - 1) degrees of freedom. In the special case of a 2 x 2 table, this test is exactly
equivalent to the two-tailed normal approximation to the S test. But for a table with more
than two rows or columns the tests are quite different, as the S test depends on the order of
the rows and columns, while the chi-square test is independent of the order. For example,
the 2 x 3 table with n = 30 and cell frequencies

    7  1  7
    3  9  3

yields a value of $X^2$ significant at the 1% level ($X^2 = 9.6$ on 2 d.f.), indicating that there is
almost certainly some kind of association between the properties U, V; but the non- 
significant score S = 0 shows that there is no ground for belief that there is any degree of 
monotonic correlation. On the other hand, the 3 x 3 table with n = 27 and cell frequencies 


    5  3  1
    3  3  3
    1  3  5

yields a non-significant value of $X^2$, but the score S = 96 is significant at the 5% level
(two-tailed test), the standard deviation of S being 42.4. The S test is clearly more powerful than
the chi-square test in discriminating between $H_0$ and the alternative hypothesis of a
tendency to monotonic correlation, while the chi-square test is appropriate for testing for
association of an unspecified kind. 
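
The contrast between the two tests can be checked numerically. The Python sketch below is not part of the original paper; the function names are assumptions, and the two arrays are the cell frequencies that reproduce the values quoted above (n = 30 with $X^2 = 9.6$ and S = 0; n = 27 with S = 96).

```python
from fractions import Fraction


def kendall_score(table):
    """Kendall's score S: concordant minus discordant untied pairs."""
    score = 0
    for i, row in enumerate(table):
        for j, a_ij in enumerate(row):
            for lower in table[i + 1:]:
                score += a_ij * (sum(lower[j + 1:]) - sum(lower[:j]))
    return score


def chi_square(table):
    """X^2 = sum (a_ij - A_ij)^2 / A_ij with A_ij = u_i v_j / n (all margins non-zero)."""
    u = [sum(row) for row in table]
    v = [sum(col) for col in zip(*table)]
    n = sum(u)
    total = Fraction(0)
    for i, row in enumerate(table):
        for j, a_ij in enumerate(row):
            expected = Fraction(u[i] * v[j], n)
            total += (a_ij - expected) ** 2 / expected
    return total


unordered = [[7, 1, 7],
             [3, 9, 3]]           # strong association, but not monotonic
monotonic = [[5, 3, 1],
             [3, 3, 3],
             [1, 3, 5]]           # weak association, but monotonic

for table in (unordered, monotonic):
    print(float(chi_square(table)), kendall_score(table))
```

Reordering the columns of either array changes the score but leaves $X^2$ unchanged, which is the distinction drawn in the text.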


6. CALCULATION OF THE DISTRIBUTION OF S 


Except in the case of tables with only two rows (or columns), the task of enumerating 
separately all permissible sets of cell frequencies in the sum (4-6), and evaluating S and the 
contribution to p for each, is excessively tedious, even for small samples such as n = 6. 
But we now show how the distribution (4-6) for a 3 x 4 table of n members can be expressed 
as a linear combination of certain distributions for either $n - u_3$ or $n - v_4$ members.

Of the set of all permissible arrays {a} in Table 1 with fixed {u}, {v}, consider the subset
in which the elements of the fourth column have the fixed values $a_1, a_2, a_3$. If we delete
this column, we obtain the set of all permissible arrays in a 3 x 3 table with the fixed marginal
totals $u_1 - a_1$, $u_2 - a_2$, $u_3 - a_3$; $v_1, v_2, v_3$. Let S' denote the score for a typical 3 x 3 array and
S the score for the corresponding 3 x 4 array. Then by considering the terms of (3-1) which
occur in S but not in S', we obtain

$$S = S' + [(u_1 - a_1) + (u_2 - a_2)]\,a_3 + [(u_1 - a_1) - (u_3 - a_3)]\,a_2 - [(u_2 - a_2) + (u_3 - a_3)]\,a_1,$$

that is,

$$S = S' + (u_1 + u_2)\,a_3 + (u_1 - u_3)\,a_2 - (u_2 + u_3)\,a_1. \tag{6-1}$$
The contribution of this subset to the sum (4-8) is

$$\frac{v_4!}{a_1!\,a_2!\,a_3!} \sum{}' \frac{v_1!\,v_2!\,v_3!}{a_{11}!\,a_{12}! \dots a_{33}!} = \frac{v_4!}{a_1!\,a_2!\,a_3!}\,P_2(S'; \{u\}', \{v\}'),$$

where $\{u\}' = (u_1 - a_1, u_2 - a_2, u_3 - a_3)$, $\{v\}' = (v_1, v_2, v_3)$, S' is related to S by (6-1), and the
summation $\sum'$ extends over all permissible 3 x 3 arrays whose score is S'. Finally, on summing
these contributions over all possible subsets, that is, all possible arrangements of the
$v_4$ elements in the fourth column, we obtain

$$P_2(S; \{u\}, \{v\}) = \sum \frac{v_4!}{a_1!\,a_2!\,a_3!}\,P_2(S'; \{u\}', \{v\}'), \tag{6-2}$$

where the summation extends over all permissible columns $(a_1, a_2, a_3)$ whose sum is $v_4$.

Similarly, by considering the 2 x 4 arrays obtained by deleting the third row of Table 1,
it may be shown that

$$P_1(S; \{u\}, \{v\}) = \sum \frac{u_3!}{b_1!\,b_2!\,b_3!\,b_4!}\,P_1(S''; \{u\}'', \{v\}''), \tag{6-3}$$

where the summation extends over all permissible rows $(b_1, b_2, b_3, b_4)$ whose sum is $u_3$,
$\{u\}'' = (u_1, u_2)$, $\{v\}'' = (v_1 - b_1, v_2 - b_2, v_3 - b_3, v_4 - b_4)$, and S'' is related to S by

$$S = S'' + (v_1 + v_2 + v_3)\,b_4 + (v_1 + v_2 - v_4)\,b_3 + (v_1 - v_3 - v_4)\,b_2 - (v_2 + v_3 + v_4)\,b_1. \tag{6-4}$$


For a ‘table’ of only one row (or column), it is obvious that S = 0 is the only possible 
score, so that p(S) = 1 for S = 0 and p(S) = 0 for $S \ne 0$. Hence we may use (6-2) or (6-3)
to find the distribution of S for tables of two rows (or columns), then for tables of three rows 
(or columns), and so on. 
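
The recurrence (6-2) lends itself to a short recursive program. The Python sketch below is not part of the original paper and its function names are assumptions. It extends (6-1) and (6-2) in the obvious way to any number of rows, and it carries the recursion down to an empty table (score 0, weight 1) rather than stopping at a single column, which gives the same relative frequencies $P_2$.

```python
from collections import Counter
from math import factorial, prod


def bounded_columns(total, caps):
    """Yield every tuple a with 0 <= a[i] <= caps[i] and sum(a) == total."""
    if not caps:
        if total == 0:
            yield ()
        return
    for first in range(min(total, caps[0]) + 1):
        for rest in bounded_columns(total - first, caps[1:]):
            yield (first,) + rest


def p2_distribution(u, v):
    """Relative frequencies P_2(S; {u}, {v}) as a Counter mapping S to P_2(S)."""
    u, v = tuple(u), tuple(v)
    if sum(u) != sum(v):
        raise ValueError("row and column totals must have the same sum")
    if not v:                                   # empty table: only S = 0, weight 1
        return Counter({0: 1})
    dist = Counter()
    v_last = v[-1]
    for a in bounded_columns(v_last, u):        # permissible last columns
        u_reduced = tuple(ui - ai for ui, ai in zip(u, a))
        # Generalisation of (6-1): a member in row i of the deleted column is
        # concordant with members in rows above it and discordant with rows below.
        increment = sum(ai * (sum(u_reduced[:i]) - sum(u_reduced[i + 1:]))
                        for i, ai in enumerate(a))
        coeff = factorial(v_last) // prod(factorial(ai) for ai in a)
        for s_reduced, weight in p2_distribution(u_reduced, v[:-1]).items():
            dist[s_reduced + increment] += coeff * weight
    return dist


dist = p2_distribution((3, 1, 3), (2, 1, 3, 1))     # marginal totals of example 7-5
total = sum(dist.values())
tail = sum(p for s, p in dist.items() if s >= 9)
print(total, tail)
```

Dividing a tail sum by the total gives the one-tailed significance level; for the marginal totals of example 7-5 this reproduces the levels 3/140, 6/140 and 7/140 quoted there for the scores 13, 11 and 9.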

In this way the distributions of S for all possible marginal totals for $n \le 6$ have been
computed. They are given in Table 4. The values of P(S) for each of the 241 distributions 
in this Table were checked by computing the moments of orders zero, one and two, the 
second moment being checked against the value given by (5-1). The extreme values of S 
and the corresponding values of P were also checked in each case by the method of § 8. 

For n > 6, the distribution of S may be found by a single application of (6-2) or (6-3) 
with the aid of Table 4, provided that the number of members in the sample can be reduced 
to 6 or fewer by deleting the first or last row or column; otherwise two or more applications
of (6-2) or (6-3) are necessary. However, for n > 9 the calculation is rarely necessary (see § 9).


7. EXAMPLES OF THE USE OF TABLE 4 


Example 7-1. Let 6 objects be ranked or graded according to the two properties U, V,
with the result

    A  A  B  C  C  D
    a  a  c  b  e  d                                        (7-1)

where A, B, C, D denote grades of U and a, b, c, d, e denote grades of V, both in order of
decreasing degree. (Note that the letters b, e in (7-1) could equally well have been written in
reverse order, since the objects concerned are ranked equal according to U.) In the form of
Table 1, we obtain the 4 x 5 table

    2  0  0  0  0 | 2
    0  0  1  0  0 | 1
    0  1  0  0  1 | 2
    0  0  0  1  0 | 1
    --------------+---
    2  1  1  1  1 | 6

We wish to test the null hypothesis that there is no association between U, V, against the 
alternative hypothesis that there is some degree of monotonic correlation between U, V. 
The class frequencies are {u} = (2, 1, 2, 1), {v} = (2, 1, 1, 1, 1) and the score S = +9.
Referring to the last page of Table 4, we locate the group of lines for which {u} = (2, 1, 2, 1)
(first column), and within this group we find the line for which {v} = (2, 1, 1, 1, 1) (second
column). Adding the values of P(9), P(10), P(11) and P(13) given on this line, and dividing
by $\sum P$, we obtain (3 + 2 + 2 + 1)/180 = 2/45 = 0.044, so that the one-tailed significance level
of the score +9 is 4.4%. We conclude, with a degree of confidence measured by the
significance level of 4.4%, that the null hypothesis of independence between U, V may
be rejected in favour of the alternative hypothesis that there is some degree of positive
monotonic correlation between U, V.

If the properties U, V had been taken in reverse order, the second line of (7-1) would have
been written first, and we should have obtained {u} = (2, 1, 1, 1, 1), {v} = (2, 1, 2, 1), S = +9.
The distribution P(S) for these {u}, {v} is not listed separately in Table 4, because it is
identical with the distribution for {u} = (2, 1, 2, 1), {v} = (2, 1, 1, 1, 1).

If the order of increasing degree had been adopted as the natural order for both U and V,
we should have obtained {u} = (1, 2, 1, 2), {v} = (1, 1, 1, 1, 2), S = +9. The distribution P(S)
for these {u}, {v} is again identical with that for {u} = (2, 1, 2, 1), {v} = (2, 1, 1, 1, 1).


Example 7-2. Let n = 5, {u} = (1, 1, 2, 1), {v} = (1, 2, 1, 1), S = +6. Since the class
frequencies (1, 1, 2, 1) do not occur in Table 4 in that order, we must first reverse the order
of the terms in {u}, and this involves a change in the sign of S. Hence we seek the level of
significance of the score S = -6 for the distribution for which {u} = {v} = (1, 2, 1, 1).
Referring to the appropriate line on the first page of Table 4, we add P(-6) and P(-8) and
divide the sum by $\sum P$, giving (4 + 2)/60 = 1/10. Therefore the one-tailed significance level
of the observed score is 10%.


Example 7-3. Let n = 6, {u} = (1, 1, 2, 1, 1), {v} = (1, 2, 3), S = -8. Before entering
Table 4, we must interchange {u} and {v} and reverse the order of the terms in (1, 2, 3).
The second of these operations normally entails a change in the sign of S, but as the sequence
(1, 1, 2, 1, 1) is symmetrical, the distribution P(S) is also symmetrical, and the sign of S is
not relevant to the determination of the significance level. We now enter Table 4 with
{u} = (3, 2, 1), {v} = (1, 1, 2, 1, 1), and S = 8 or -8, and obtain the one-tailed significance
level (2 + 2)/60 = 1/15. The two-tailed significance level is 2/15.


Example 7-4. Let n = 6, {u} = (1, 3, 2), {v} = (1, 1, 1, 1, 1, 1), S = 7. Since the second
ranking is untied, the order of the terms in {u} is immaterial. The required distribution P(S)
is grouped in Table 4 with the first {u} whose terms are a permutation of (1, 3, 2), namely,
{u} = (3, 2, 1). We obtain for the one-tailed significance level (4 + 2 + 1)/60 = 7/60 = 0.12,
in agreement with the value given by Sillitto (1947, Table 2). The two-tailed significance
level is 7/30.


Example 7-5. Let n = 7, {u} = (3, 1, 3), {v} = (2, 1, 3, 1). It is required to find the
distribution of S. The simplest method is to use the expansion (6-2) in terms of distributions
of 6 members, obtained by deleting the last column of the frequency table of 7 members.
Since $v_4 = 1$, the only possible sets of cell frequencies for the last column $(a_1, a_2, a_3)$ are
(0, 0, 1), (0, 1, 0) and (1, 0, 0), and the corresponding row totals {u}' for the reduced tables
are (3, 1, 2), (3, 0, 3) and (2, 1, 3). In each case $v_4!/a_1!\,a_2!\,a_3! = 1$. Hence (6-2) yields

$$P_2(S, 313, 2131) = P_2(S', 312, 213) + P_2(S'', 303, 213) + P_2(S''', 213, 213),$$

where S = S' + 4, S = S'' + 0, S = S''' - 4, by (6-1). The contribution $P_2(S', 312, 213)$ is found
from the distribution $P_2$ for {u} = (3, 1, 2), {v} = (3, 1, 2) by changing the sign of each score
(due to reversal of {v}) and then adding 4 to each score. The contribution $P_2(S'', 303, 213)$ is
the same as the distribution $P_2$ for {u} = (3, 3), {v} = (3, 1, 2), since the empty row may be
ignored. The contribution $P_2(S''', 213, 213)$ is found from the distribution $P_2$ for
{u} = {v} = (3, 1, 2) by subtracting 4 from each score. The contributions are added as shown
in Table 2. Some negative values of S have been omitted from Table 2 to conserve space, but
the last line is easily completed by symmetry. The result may be checked by verifying that
$\sum P_2 = n!/u_1!\,u_2!\,u_3! = 140$ [see (4-9)], that $\sum S^2 P_2 = 140(\operatorname{var} S) = 4692$ [see (5-1)],
and that the extreme scores and their probabilities agree with the values found by the
method of § 8 (see example 8-1). We conclude that the one-tailed significance levels of the
scores 13, 11, 9, 8, ... are 3/140, 6/140, 7/140, 19/140, ....


Table 2. Summation for example 7-5

S                      ...  -4  -3  -2  -1   0   1   2   3   4   5   6   7   8   9  10  11  12  13
P_2(S', 312, 213)      ...   2   3   .   .  12   .   6   .   6  12   .   .  12   .   .   3   .   3
P_2(S'', 303, 213)     ...   .   .   6   .   .   .   6   .   .   3   .   .   .   1   .   .   .   .
P_2(S''', 213, 213)    ...   6   .   6   .  12   .   .   3   2   .   .   1   .   .   .   .   .   .
P_2(S, 313, 2131)      ...   8   3  12   .  24   .  12   3   8  15   .   1  12   1   .   3   .   3


Example 7-6. Let n = 8, {u} = (1, 3, 2, 2), {v} = (1, 2, 1, 4). It is required to find the
distribution of S. We use (6-3) to express $P_1(S, 1322, 1214)$ in terms of the distributions obtained
by deleting the last row. We first list the eight permissible ways of inserting the two members
in the last row (see below). Beside each we write the corresponding term on the right of
(6-3), complete with its multinomial coefficient. We then rewrite each of these terms,
omitting empty columns (if any) and making any reversals or interchanges in {u}, {v}
needed to make immediate entry into Table 4 possible. (If {u}, {v} are interchanged, $P_1$ must
be changed to $P_2$, and if the sign of S' has to be changed, this must be noted.) Finally, we
write the relation between S and S' in each case, giving the following scheme:

    0002    1 P_1(S', 132, 1212)    1 P_1(S', 231, 2121)    S = S' + 8
    0011    2 P_1(S', 132, 1203)    2 P_2(S', 321, 231)     S = S' + 3
    0101    2 P_1(S', 132, 1113)    2 P_1(S', 231, 3111)    S = S' + 0
    1001    2 P_1(S', 132, 0213)    2 P_2(S', 312, 231)     S = S' - 3
    0110    2 P_1(S', 132, 1104)    2 P_2(S', 411, 231)     S = S' - 5
    1010    2 P_1(S', 132, 0204)    2 P_2(S', 42, 231)      S = S' - 8
    0200    1 P_1(S', 132, 1014)    1 P_2(S', 411, 231)     S = S' - 8
    1100    2 P_1(S', 132, 0114)    2 P_2(S', 411, 231)     S = S' - 11


The increments to be added to S' to give S were found rapidly by writing down the 2 x 4 arrays

    1212   1203   1113   0213   1104   0204   1014   0114
    0002   0011   0101   1001   0110   1010   0200   1100

and finding the score, that is, the sum of the second-order determinants, in each. This is
more convenient than direct substitution in (6-4). The eight distributions are now summed
as shown in Table 3 (in which some values of S have been omitted to conserve space).
Each of these eight distributions is derived from the corresponding distribution P(S) in
Table 4 by adding the appropriate increment to each score and multiplying each P by the
multinomial coefficient and by the factor $P_1/P$ or $P_2/P$. The sum of these eight distributions
is the required distribution $P_1(S, 1322, 1214)$. The result may be checked by verifying that
$\sum P_1 = n!/v_1!\,v_2!\,v_3!\,v_4! = 840$ [see (4-9)], that $\sum S P_1 = 0$, that $\sum S^2 P_1 = 840(\operatorname{var} S) = 43{,}090$
[see (5-1)], and that the extreme scores and their probabilities agree with those found by the
method of §8. Unfortunately none of these checks would reveal a mistake caused by the 
erroneous use of a distribution with (for example) {u} = (321) instead of (312) in con- 
structing Table 3, or omitting to change the sign of S’ in such a distribution, unless the 
mistake affected one of the extreme scores. Therefore the computer should take great care 
to avoid such mistakes. The only certain safeguard is to repeat the entire calculation by 
another method, that is, by deleting a different row or column. 


Table 3. Summation for example 7-6

[The body of Table 3 is illegible in the scanned source. Recoverable entries: the eight contributions have row totals 180, 120, 240, 120, 60, 30, 30 and 60, and their sum $P_1(S, 1322, 1214)$ has total 840; at the two extreme scores shown, L = 0.48% and 0.36%, and L' = 0.49% and 0.49%.]


The whole calculation described above, including all checks, was completed in about 1 hr.
In the last two lines of Table 3 are given the one-tailed significance levels L computed
from $P_1(S)$, and the corresponding approximate levels L' computed from the normal
approximation to p(S), with var S = 51.3 and a continuity correction of 1/2 subtracted from each
positive score and added to each negative score. Both L and L' are expressed as percentages
correct to two significant figures. It is seen that, except for two of the four extreme scores
(which can be treated separately by the method of § 8), the differences between L and L'
never exceed one-fifth of L.


8. SIGNIFICANCE LEVELS OF EXTREME SCORES 


In the distribution p(S, {u}, {v}), let $S_1$ denote the highest positive score which occurs,
$S_2$ the highest but one, and $S_3$ the highest but two. It will now be shown how we can always
find fairly quickly the exact significance levels of $S_1$ and $S_2$ (and sometimes of $S_3$), and of
the corresponding negative scores, working from first principles.

We first consider the effect on the score of a transposition of two of the n members in
one ranking. Let a set of four cell frequencies belonging to two adjacent rows and two
adjacent columns in Table 1 be

$$\begin{matrix} a & b \\ c & d. \end{matrix}$$

If a, d are both positive (i.e. not zero), we may move one of the a members down one row and
add it to c, provided that we simultaneously move one of the d members up one row and
add it to b, so that the marginal totals remain unchanged. This transposition may be
denoted by

$$\begin{matrix} a & b \\ c & d \end{matrix} \;\to\; \begin{matrix} a-1 & b+1 \\ c+1 & d-1 \end{matrix} \tag{8-1}$$

and it is easy to show that its effect on the total score S is to reduce S by a + b + c + d units,
provided that all the other cell frequencies remain unchanged. When the rows or columns
affected by the transposition are not adjacent, we find similarly that the transposition

$$\begin{matrix} a & e_1 & e_2 & b \\ c & e_3 & e_4 & d \end{matrix} \;\to\; \begin{matrix} a-1 & e_1 & e_2 & b+1 \\ c+1 & e_3 & e_4 & d-1 \end{matrix} \tag{8-2}$$

reduces S by $(a+b+c+d) + 2(e_1+e_2+e_3+e_4)$, and more generally the transposition

$$\begin{matrix} a & e_1 & e_2 & b \\ f_1 & x_1 & x_2 & f_2 \\ c & e_3 & e_4 & d \end{matrix} \;\to\; \begin{matrix} a-1 & e_1 & e_2 & b+1 \\ f_1 & x_1 & x_2 & f_2 \\ c+1 & e_3 & e_4 & d-1 \end{matrix} \tag{8-3}$$

reduces S by $(a+b+c+d) + 2(e_1+e_2+e_3+e_4+f_1+f_2) + 4(x_1+x_2)$, and similarly for any
other numbers of intervening rows and columns.

To find the extreme score $S_1$ for Table 1, we choose the cell frequencies {a} in such a way
that the n members are ranked in the same order by the two rankings, except in so far as
this is prevented by ties. This means that no two cells with positive cell frequencies can be
joined by a line sloping upwards from left to right, and that there are no negative terms in
(3-1). Having written down this array {a} (which is clearly unique), we at once obtain $S_1$
from (3-1) and $p(S_1)$ from (4-6), the sum on the right of (4-6) now comprising only one term.

To find the score $S_2$, we examine all possible transpositions of the form (8-1) in the array
{a} whose score is $S_1$. Let m be the smallest value of a + b + c + d which occurs (for a, b, c, d
in adjacent rows and columns and a, d both positive). (Note that in the present case at least
one of b, c must be zero.) Then $S_2 = S_1 - m$, and $p(S_2)$ is given by (4-6), the number of terms
on the right of (4-6) now being equal to the number of distinct transpositions which reduce
the score by m. The contribution to $p(S_2)$ due to each of these terms can be written down
by inspection as a simple rational multiple of $p(S_1)$, in the form

$$\frac{ad}{(b+1)(c+1)}\,p(S_1).$$

In practice it is easier to work in terms of $P_1(S)$ or $P_2(S)$, whichever is smaller.

The determination of all arrays {a} whose score is $S_3$ may involve transpositions of the
form (8-2) or (8-3) or one or two transpositions of the form (8-1), and is often difficult, but
in some cases it is quite simple, as in the following example.


Example 8-1. Let n = 7, {u} = (3, 1, 3), {v} = (2, 1, 3, 1). Here $\sum P_1(S) = 7!/2!\,3! = 420$,
while $\sum P_2(S) = 7!/3!\,3! = 140$, so we choose to work in terms of $P_2(S)$. In the array whose
score is $S_1$ (see below), the transposition

$$\begin{matrix} 1 & 0 \\ 0 & 1 \end{matrix} \;\to\; \begin{matrix} 0 & 1 \\ 1 & 0 \end{matrix}$$

reduces the score by two units, and the transposition

$$\begin{matrix} 1 & 0 \\ 2 & 1 \end{matrix} \;\to\; \begin{matrix} 0 & 1 \\ 3 & 0 \end{matrix}$$

reduces the score by four units. Thus we obtain the arrays

    2 1 0 0 | 3      2 0 1 0 | 3      2 1 0 0 | 3
    0 0 1 0 | 1      0 1 0 0 | 1      0 0 0 1 | 1
    0 0 2 1 | 3      0 0 2 1 | 3      0 0 3 0 | 3
    --------+--      --------+--      --------+--
    2 1 3 1 | 7      2 1 3 1 | 7      2 1 3 1 | 7

with scores 13, 11, 9; and we see that these are the only arrays giving scores $\ge 9$. (For
example, the transposition

$$\begin{matrix} 2 & 1 & 0 \\ 0 & 0 & 1 \end{matrix} \;\to\; \begin{matrix} 1 & 1 & 1 \\ 1 & 0 & 0 \end{matrix}$$

in the first array reduces the score by 5.) Therefore

$$S_1 = 13, \quad S_2 = 11, \quad S_3 = 9, \quad P_2(S_1) = 2!\,3!/2!\,2! = 3, \quad P_2(S_2) = 3, \quad P_2(S_3) = 1,$$

and the corresponding one-tailed significance levels are 3/140, 6/140, 7/140, in agreement
with the results obtained by a longer method in example 7-5.
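
The transposition rules (8-1) to (8-3) can be checked numerically. The Python sketch below is not part of the original paper and its function names are assumptions. It applies a transposition to an array and compares the observed fall in the score with the predicted one, in which corner cells of the affected rectangle carry weight 1, other edge cells weight 2 and interior cells weight 4.

```python
def kendall_score(table):
    """Kendall's score S: concordant minus discordant untied pairs."""
    score = 0
    for i, row in enumerate(table):
        for j, a_ij in enumerate(row):
            for lower in table[i + 1:]:
                score += a_ij * (sum(lower[j + 1:]) - sum(lower[:j]))
    return score


def transpose_pair(table, r1, c1, r2, c2):
    """Move one member from (r1, c1) down to (r2, c1) and one from (r2, c2) up to (r1, c2).

    Requires r1 < r2, c1 < c2 and positive frequencies in cells (r1, c1) and (r2, c2).
    """
    if not (r1 < r2 and c1 < c2) or table[r1][c1] == 0 or table[r2][c2] == 0:
        raise ValueError("transposition not permissible")
    new = [list(row) for row in table]
    new[r1][c1] -= 1
    new[r2][c1] += 1
    new[r2][c2] -= 1
    new[r1][c2] += 1
    return new


def predicted_drop(table, r1, c1, r2, c2):
    """Reduction in S predicted by (8-1) to (8-3) for the transposition above."""
    drop = 0
    for i in range(r1, r2 + 1):
        for j in range(c1, c2 + 1):
            row_weight = 1 if i in (r1, r2) else 2
            col_weight = 1 if j in (c1, c2) else 2
            drop += row_weight * col_weight * table[i][j]
    return drop


# Array of maximum score in example 8-1 and the three transpositions discussed there.
best = [[2, 1, 0, 0],
        [0, 0, 1, 0],
        [0, 0, 2, 1]]
for corners in [(0, 1, 1, 2), (1, 2, 2, 3), (0, 0, 1, 2)]:
    after = transpose_pair(best, *corners)
    print(corners, kendall_score(best) - kendall_score(after), predicted_drop(best, *corners))
```

The three transpositions listed reduce the score by 2, 4 and 5 units respectively, as stated in example 8-1.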


Example 8-2. Let n = 10, {u} = (3, 4, 3), {v} = (2, 3, 3, 2). Here

$$\sum P_2(S) = 10!/3!\,4!\,3! = 4200.$$

In the array whose score is $S_1$ (see below), there are precisely four transpositions which
reduce the score by 5. Thus we obtain the arrays

    2 1 0 0      1 2 0 0      2 0 1 0      2 1 0 0      2 1 0 0
    0 2 2 0      1 1 2 0      0 3 1 0      0 1 3 0      0 2 1 1
    0 0 1 2      0 0 1 2      0 0 1 2      0 1 0 2      0 0 2 1

with scores $S_1 = 29$, $S_2 = 24$ (four times). Therefore

$$P_2(S_1) = 2!\,3!\,3!\,2!/(2!)^4 = 9, \qquad P_2(S_2) = 18 + 6 + 6 + 18 = 48,$$

and the corresponding one-tailed significance levels are 9/4200 and 57/4200. Perhaps less
obviously, there are precisely four ways of obtaining the score 20 (in each case by a further
transposition), from which we can show that $S_3 = 20$, $P_2(S_3) = 6 + 6 + 6 + 6 = 24$, and the
significance level of $S_3$ is 81/4200.


9. THE NORMAL APPROXIMATION 


In example 7-6 and Table 3 we have already indicated the accuracy of the normal 
approximation in a certain distribution of 8 members with irregularly tied rankings. 
Examination of several other distributions of 9, 10 and 12 members has indicated that for a 
given number of rows and columns the accuracy improves with increasing n, and that for 
a given value of n, the accuracy improves with increasing numbers of rows or columns. 
Since the interval between scores (other than the extreme scores) is usually one unit, it is 


advisable to apply a continuity correction of half a unit except in the special cases described 
below. 


Example 9-1. Let n = 10 and {u} = {v} = (2, 2, 2, 2, 2). The exact distribution p(S) was
computed with the aid of Table 4 by a two-stage application of (6-2). The corresponding
one-tailed significance levels L, and the approximate significance levels L' given by the
normal approximation with a continuity correction of 1/2 and var S = 115.6, are shown
below for all scores for which L' lies between 0.1 and 10%:

    S        15    16    17    18    19    20    21    22    23
    L (%)    9.6   7.3   6.7   5.0   4.2   3.7   2.6   2.1   1.78
    L' (%)   8.9   7.5   6.3   5.1   4.3   3.5   2.8   2.3   1.82

    S        24    25    26    27    28    29    30    31    32    33
    L (%)    1.19  0.99  0.65  0.54  0.40  0.21  0.15  0.12  0.08  0.04
    L' (%)   1.44  1.13  0.89  0.69  0.53  0.40  0.31  0.23  0.17  0.13


The approximation when L > 1 % is evidently so good that the computation of the exact 
distribution is scarcely necessary. When L < 1 % the errors in L’ become large, but in this 
region the research worker is not usually very much concerned with the exact significance 
level; further, because L’ is too large, the errors are ‘safe’ in the sense that they will not 
mislead the research worker into attributing significance to non-significant scores. 


Example 9-2. Let n = 10 and let both rankings be untied. The exact significance levels L
(from Kendall, 1955, Appendix Table 1), and the approximate significance levels L' given
by the normal approximation with a continuity correction of 1 and var S = 125, are shown
below for all scores for which L' lies between 0.1 and 10%:

    S        17    19    21    23    25    27    29    31    33    35
    L (%)    7.8   5.4   3.6   2.3   1.4   0.83  0.46  0.23  0.11  0.05
    L' (%)   7.6   5.4   3.7   2.5   1.6   1.00  0.61  0.37  0.21  0.12


Comparison of the errors in L’ in examples 9-1 and 9-2 shows that the presence of the ties 
in the rankings in example 9-1 does not greatly affect the accuracy of the normal 
approximation. 
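
The approximate levels L' are simply upper-tail areas of the normal distribution after the continuity correction has been applied. A minimal Python sketch follows (not part of the original paper; the function name and its parameters are assumptions introduced here). It uses only the standard library.

```python
from math import erfc, sqrt


def normal_tail_level(score, variance, correction=0.5):
    """One-tailed level L' for an observed score under the normal approximation.

    The continuity correction is subtracted from |score| before standardising,
    so the same function serves for positive and negative scores.
    """
    z = (abs(score) - correction) / sqrt(variance)
    return 0.5 * erfc(z / sqrt(2))          # upper-tail area beyond z


# Untied rankings of 10 members (example 9-2): var S = 125, correction 1.
for s in (17, 25, 35):
    print(s, round(100 * normal_tail_level(s, 125, correction=1.0), 2))
```

With var S = 125 and a correction of 1, the scores 17, 25 and 35 give levels close to the 7.6%, 1.6% and 0.12% quoted for L' in example 9-2.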

The appropriate correction for continuity is usually 1/2, but the following exceptional 
cases should be noted. 

If one ranking (say the second) is untied, then it follows from the theory of § 8 that every 
transposition alters S by an even number, since every column in {a} contains precisely one 
member. Hence the scores occur at intervals of 2 units, and the correction for continuity 
is unity. This result also holds when both rankings are untied. 

If one ranking (say the first) is a dichotomy, then any transposition will alter S by one
of the numbers $v_1 + v_2$, $v_2 + v_3$, $v_3 + v_4$, ... or by the sum of two or more of these numbers.
Hence if the highest common factor of these numbers is h, any two successive values of S
will differ by h (or some whole multiple of h), and the correction for continuity is h/2. For
example, if {v} = (3, 1, 3, 1), h = 4; if {v} = (3, 1, 1, 3) or (1, 3, 3, 1), h = 2; if {v} = (3, 2, 3, 7),
h = 5; if {v} = (1, 1, ..., 1), h = 2; if {v} = (t, t, ..., t), h = 2t; if {v} = $(v_1, v_2)$, $h = v_1 + v_2 = n$.
This result unifies and generalizes three results given by Kendall (1955, § 4.12 (a), (b), (c)).
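
The interval h for a dichotomy is a highest common factor of sums of adjacent column totals, which is quickly computed. The following Python sketch is not part of the original paper; the function name `score_interval` is an assumption, and the ranking {v} is assumed to have at least two classes.

```python
from functools import reduce
from math import gcd


def score_interval(v):
    """Highest common factor h of v_1 + v_2, v_2 + v_3, ... (needs len(v) >= 2)."""
    if len(v) < 2:
        raise ValueError("the second ranking must have at least two classes")
    adjacent_sums = [a + b for a, b in zip(v, v[1:])]
    return reduce(gcd, adjacent_sums)


for v in [(3, 1, 3, 1), (3, 1, 1, 3), (1, 3, 3, 1), (3, 2, 3, 7), (6, 3, 3)]:
    h = score_interval(v)
    print(v, h, h / 2)      # class frequencies, interval h, continuity correction h/2
```

The printed corrections are the values h/2 described in the text, for instance 3/2 for {v} = (6, 3, 3).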


Example 9.3. Let n = 12, {u} = (7, 5), {v} = (6, 3, 3). Here var S = 127.6, h = 3, and the
correction for continuity is 3/2. In the following results L, L' have the same meanings as
before and some of the non-significant scores are omitted:


S        −30    −21    −15    −12   ...   +12    +15    +18    +24    +27    +33
L (%)    0.76   6.4    12.1   19.7  ...   14.0   11.7   9.8    3.0    0.8    0.38
L' (%)   0.58   4.2    11.6   17.6  ...   17.6   11.6   7.2    2.3    1.2    0.27

It is seen that the normal approximation is not very good. However, the exact values of
L for the six extreme scores S ≤ −15 and S ≥ +24 can all be found without difficulty by
the method of § 8, and since these include all the significant scores, the normal approximation
need not be used.
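
The exact levels for a dichotomy can be checked by direct enumeration. The sketch below is a brute-force enumeration rather than the method of § 8, and its function names are illustrative: it lists every way of allotting the second-row members of a 2 × k table to the columns, weights each allotment by the number of ways of choosing those members, and accumulates the score S.

```python
from collections import Counter
from itertools import product
from math import comb

def score_2xk(row1, row2):
    """Kendall score S for a 2 x k table with the given cell frequencies."""
    s = 0
    for j in range(len(row1)):
        for k in range(j + 1, len(row1)):
            s += row1[j] * row2[k] - row2[j] * row1[k]
    return s

def exact_distribution(u, v):
    """Relative frequencies of S for row totals u = (u1, u2) and column totals v."""
    dist = Counter()
    for row2 in product(*(range(vj + 1) for vj in v)):
        if sum(row2) != u[1]:
            continue
        row1 = tuple(vj - aj for vj, aj in zip(v, row2))
        ways = 1
        for vj, aj in zip(v, row2):
            ways *= comb(vj, aj)  # choose which members of column j fall in row 2
        dist[score_2xk(row1, row2)] += ways
    return dist

dist = exact_distribution((7, 5), (6, 3, 3))
total = sum(dist.values())
for s in sorted(dist):
    lower = sum(c for t, c in dist.items() if t <= s) / total
    upper = sum(c for t, c in dist.items() if t >= s) / total
    print(f"S = {s:+3d}  frequency = {dist[s]:3d}  lower tail = {100*lower:5.2f}%  upper tail = {100*upper:5.2f}%")
```

For the class frequencies of example 9.3 the enumeration gives the attainable scores and their one-tailed levels in both directions.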

Several other distributions with n = 9, 10 and 12 were examined in this way. In some
cases it was found that the normal approximation was not required, for the same reason as
in example 9.3. In the remaining cases it was found that the normal approximation (with
appropriate correction for continuity) was reasonably good at least in the region L > 1%,
as in examples 9.1 and 9.2. We conclude that for n > 9 it is seldom necessary to compute
the exact distribution of S by the method of examples 7.5 and 7.6, except when great
accuracy is required.

Finally, we comment further on the correction for continuity when the first ranking is
a dichotomy and the second is tied in any manner. Whitfield (1947) suggests that the mean
interval between successive values of S is approximately the mean of the numbers v_1 + v_2,
v_2 + v_3, v_3 + v_4, ..., so that the continuity correction is one-half of this mean interval. But
the mean interval computed in this way is usually too large, as it gives the mean change in
S produced by a single transposition involving adjacent columns, whereas in most cases
a change from any value of S to the next attainable value involves two or more simultaneous
transpositions which change S by a number of the form


(v_1 + v_2) − (v_2 + v_3) or (v_1 + v_2) − (v_3 + v_4).


Thus in example 9.3, the true mean interval between successive values of S (omitting
the extreme values of +33 and −30, for which the continuity correction is never required)
was found to be 4, which is closer to the highest common factor of 6 + 3 and 3 + 3 (h = 3)
than to their mean (= 7.5).


Example 9.4. Whitfield (1947) and Kendall (1955, examples 3.4 and 4.5) discuss a sample
with n = 17 whose rankings may be displayed in the following 2 × 7 table:


4  0  2  4  2  0  0 | 12
0  1  0  1  1  1  1 |  5
--------------------+---
4  1  2  5  3  1  1 | 17


Here S = 34, var S = 344.6, h = 1, and the mean of 4 + 1, 1 + 2, ..., 1 + 1 is 29/6. With a
continuity correction of 1/2 we obtain L' = 3.54%, but with a correction of 29/12,
L’ = 4-45 %, (both one-tailed). The exact distribution P,(S) for S > 23 was found from (4-8) 
by enumerating the possible ways of arranging the five members in the second row, and 
yielded the exact significance level L = 3-36 % for S = 34. (Whitfield obtained L = 3-68 %.) 
Comparison of L with the two values of L’ shows that the continuity correction of 1/2 is 
the more appropriate. The reason for this is that every integral score in (at least) the range 
23 < S < 42 is attained, with the fortuitous exception of S = 35. 


10. RELATIONS WITH OTHER TESTS 


In the case of a 2 × 2 table only one kind of association is possible, since the cell frequencies
have only one degree of freedom. The S test is then equivalent to Fisher's exact probability
test, and the normal approximation to the S test is equivalent to the chi-square test.

In the case of a 2 × m table, where the first ranking is a dichotomy and the second may or
may not contain ties, the S test is equivalent to the Mann-Whitney U test, in which U is
the sum of the positive terms only (or negative terms only) in the sum of the form (3.1)
defining S. For example, if n = 6 and {u} = (3, 3) and {v} = (1, 1, 1, 1, 1, 1), the significance
levels of S = 9, 7, 5, 3 and 1 are identical with the significance levels of U = 0, 1, 2, 3 and 4,
respectively.

Hence the S test includes as special cases the Fisher test for association in 2 × 2 contingency
tables, and the Mann-Whitney U test. For descriptions of these tests, bibliography
and tables of distributions and significance points, see Siegel (1956).

Table 4. Distribution of Kendall's score (of rank sum) S,
for samples of 2, 3, 4, 5 and 6 members


Explanation of the table 


Each of the 241 distributions occupies one line of the table. The distributions for samples of n = 2, 3, 4
and 5 members are shown on the first page, and the distributions for samples of n = 6 members are
shown on the remaining three pages. For each distribution, the figures in the columns from left to right
show:

{u}, the class frequencies in the first ranking (or row totals).

{v}, the class frequencies in the second ranking (or column totals).

P(S), the relative frequency of obtaining the score S (shown at the head of the column), in random
sampling with fixed {u}, {v} from a population in which the criteria of ranking are independent.

ΣP, the sum of the relative frequencies. Division by ΣP converts the relative frequencies P(S) to
probabilities.

P,/P, the factor by which P(S) must be multiplied to obtain P,(S), the relative frequencies such that 
=P, = n!/(v,!v_!...). Similarly UP, = n!/(u,!u,_!...). The numbers P,, P, are only required when 
computing a distribution for n > 6. 

If the observed sets of class frequencies {u}, {v} do not occur in Table 4, they must be modified in
accord with one or more of the following rules before entering the Table:

(a) If the order of the terms in either {u} or {v} (but not both) is reversed, the sign of every score S
is to be changed.

(b) If the order of the terms in both {u} and {v} is reversed, the distribution of S remains unchanged.

(c) If{u} and {v} are interchanged, then P, and P, are to be interchanged, but P(S) remains unchanged. 

(d) If one ranking contains no ties (i.e. every class frequency is unity), the distribution is the same for
every permutation of the class frequencies in the other ranking.

(e) Empty classes in {u} or {v} or both may be ignored.


Test of significance for n ≤ 6. To find the level of significance of an observed score, first locate the
appropriate line of the table and denote the observed score (after changing its sign if necessary) by S_0.
Add the tabulated relative frequencies P for the score S_0 and all scores to the right of S_0 (if S_0 is positive)
or to the left of S_0 (if S_0 is negative). Divide the sum by the tabulated value of ΣP. The quotient is the
'one-tailed' significance level. If the distribution is symmetrical (which occurs when {u} or {v} or
both are symmetrical) the quotient may be doubled to obtain the 'two-tailed' significance level. (See
examples 7.1 to 7.4.)


Distributions for n > 6. For the sake of definiteness suppose that 
{u} = (Uy, Ue,Us) and {v} = (v,,—,03,%4), Where Lu= XLey=n> 6. 
Then in terms of distributions of n — v, members, 
v,! 
P,(S, {u}, {v}) = .—-*—_P,,(8’, {u}’, {v}’), 
a(S, {0}, (0)) = ET Pal’ (u's (oF) 


where {u}’ = (uy — Gy, Ug — Ag, Uz — Ag), {V}’ = (v4, Vg, V3), and the score S is obtained from S’ by adding the 
scores associated with the cell frequencies @,, @,, a, in the last column, thus 


S= 8’ + (uy + Ug) ag + (Uy — Ug) dg + (— Ug — Ug) Ay; 


and the summation extends over all permissible sets of non-negative integers a, a), 4, whose sum is v4. 
Similarly in terms of distributions of n — u, members 


u — Us! ‘a ? £07 
PAS, (0,103) = Ep rp gp p PHS ed's OF) 
where {u}’ = (uy, Ug), {v}’ = (vy — by, ..., 0g — Dg), 
and S = 8’ + (vy + Vet Vg) by + (Uy + Vg — U4) bg + (VU — Vg — Vg) Og + ( — VQ— Vg — U4) Dy. 


(See examples 7.5 and 7.6.)


Table 4
See explanation on opposite page 166

[Table 4: the tabulated entries of P(S), ΣP and the associated factors for n = 2 to 6 are not reproduced here.]

* For {u} = {v} = (1, 1, 1, 1, 1), P(−10) = 1 and P(+10) = 1.


The distribution of Kendall's score S for a pair of tied rankings


On normalizing the incomplete beta-function for fitting to dose-response curves


By M. E. WISE 
Radiological Protection Service, Belmont, Sutton, Surrey 


SUMMARY 


It is shown that if the probability integral of x is the incomplete beta-function I_x(p, q) = α, the cube
root of −log_e x is distributed normally to good approximation for all q ≥ 2 and p ≥ q, at least for
probabilities α between 0.05 and 0.95. This makes it easy to fit dose-response curves when α is the
probability of response and x depends on the dose, and to do so more simply than from another
transformation of x recently published in Biometrika.


1. INTRODUCTION 


Recently Kimball & Leach (1959) put forward an interesting application of the incomplete beta
distribution to a radiation dose-response problem. The instance quoted could arise in 'multiple target
theory'. Given a finite number of targets, the distribution of the number of hits for a given dose of
radiation can be binomial. If response occurs only after at least q hits, the probability α of response is
of the form

\[ \alpha = I_x(p,q) = \int_0^x t^{p-1}(1-t)^{q-1}\,dt \Big/ \int_0^1 t^{p-1}(1-t)^{q-1}\,dt, \]

where x is a function of the dose. There are, of course, many other situations that could give rise to a
similar relationship.

The authors' problem was to replace x by a normally distributed variate, by means of a transformation
of x not involving p and q. Then the normal deviate y_α corresponding to α, such that

\[ \frac{1}{\sqrt{2\pi}} \int_{y_\alpha}^{\infty} \exp(-\tfrac12 t^2)\,dt = \alpha, \]

can be written

\[ y_\alpha = A + BX, \]

where X is the transformed function of x and A and B are functions of p and q. As they point out, this
is just the form of relationship needed for a probit analysis of the dose-response data.
The transformation given is rather complex, namely 


1" et 


5 
"Tse stint gO?) 


3 


a 





Their numerical tables show that it achieves its aim except when q is between about 0-8p and 0-9p, 
when the error is often large. But the results present a challenge to find something better or simpler. 
The improved solution is very simple, namely 


\[ X^3 = -\log_e x. \]


2. THE BASIS OF THE TRANSFORMATION 


The only difference from the method of Kimball & Leach arises in relating I_x(p, q) to the χ² distribution;
afterwards χ² is normalized in the same way as before. If χ²_{2q}(α) is the χ² value with 2q degrees of
freedom and a probability α of being exceeded, we can put

\[ \tfrac12\chi^2_{2q}(\alpha) = -N\log_e x\left[1 - \frac{(q-1)\{q+1+\tfrac12\chi^2_{2q}(\alpha)\}}{24N^2} + O(N^{-4})\right], \qquad (1) \]

where N = p + ½q − ½ and α = I_x(p, q).


The N^{-4} term in the bracket is about 1/400 when p = q and is proportional to (q/N)^4. One should choose
p ≥ q; if not, p and q can be interchanged since 1 − α = I_{1−x}(q, p). For details, see Wise (1950).

In order to 'separate the variables', the χ² term in the bracket in (1) must be replaced by an approximation
to it independent of α and x. In terms of the normal deviate y_α (Campbell, 1923; Riordan, 1949)

\[ \tfrac12\chi^2_{2q}(\alpha) = q - \tfrac13 + y_\alpha\sqrt{q} + \tfrac13 y_\alpha^2 + \tfrac{1}{36}(y_\alpha^3 - 7y_\alpha)/\sqrt{q} + \ldots \qquad (2) \]

Hence we can put ½χ²_{2q}(α) equal to q or to q − ⅓. The latter has the advantage of being almost exactly
right at the 50% point. This approximation is a poor one for small q and extreme probabilities (very
near 0 or 1), but the error is in all cases multiplied by the small factor (q − 1)/(24N²). A related approximation
was in fact chosen by Hartley & Fitch (1951) as the basis of their charts, and these give the
incomplete beta-function 'to graphical accuracy'.

For normalizing χ² the Wilson-Hilferty (1931) transformation is again used:

\[ \left\{\frac{\chi^2_{2q}(\alpha)}{2q}\right\}^{1/3} = 1 - \frac{1}{9q} + \frac{y_\alpha}{3\sqrt{q}}. \qquad (3) \]

This distribution of the cube root of y? has coefficients of skewness y, = 4/(27q?) and of excess 
Yo = —2/(9q), respectively. (See, for example, Kendall & Stuart, 1958, § 16-7.) 

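Substituting in (1) the approximation from (3) and the one obtained from putting y_α = 0 in (2) yields

\[ y_\alpha = -3\sqrt{q}\left(1-\frac{1}{9q}\right) + 3(-\log_e x)^{1/3}\left[N\sqrt{q}\left\{1 - \frac{(q-1)(q+\tfrac13)}{12N^2}\right\}\right]^{1/3}. \qquad (4) \]

Hence (−log_e x)^{1/3} has mean

\[ \left(\frac{q}{N}\right)^{1/3}\left(1-\frac{1}{9q}\right)\left\{1-\frac{(q-1)(q+\tfrac13)}{12N^2}\right\}^{-1/3} \]

and standard deviation nearly proportional to q^{-1/6} N^{-1/3}.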
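
As a check on how equation (4) is applied, the following sketch evaluates the approximation for integer p and q and compares it with the exact incomplete beta-function. It assumes the reading of (4) in which the χ² term of (1) is replaced by q − ⅓ and N = p + ½q − ½; the exact value uses the binomial-sum form of I_x(p, q), which holds for integer p and q. The function names are illustrative, not taken from the paper.

```python
from math import comb, erfc, log, sqrt

def beta_cdf_integer(x, p, q):
    """Exact I_x(p, q) for positive integers p and q, as a binomial tail sum."""
    n = p + q - 1
    return sum(comb(n, j) * x**j * (1 - x)**(n - j) for j in range(p, n + 1))

def wise_normal_deviate(x, p, q):
    """Normal deviate y_alpha from equation (4), for 0 < x < 1."""
    big_n = p + 0.5 * q - 0.5
    shrink = 1 - (q - 1) * (q + 1 / 3) / (12 * big_n**2)
    return (-3 * sqrt(q) * (1 - 1 / (9 * q))
            + 3 * (-log(x)) ** (1 / 3) * (big_n * sqrt(q) * shrink) ** (1 / 3))

def wise_beta_cdf(x, p, q):
    """Approximate I_x(p, q): the upper-tail normal probability beyond y_alpha."""
    y = wise_normal_deviate(x, p, q)
    return 0.5 * erfc(y / sqrt(2))

for x, p, q in [(0.14, 2, 2), (0.42, 5, 2), (0.36, 2, 2)]:
    exact = beta_cdf_integer(x, p, q)
    approx = wise_beta_cdf(x, p, q)
    print(f"x = {x}, p = {p}, q = {q}: exact = {exact:.4f}, approx = {approx:.4f}")
```

Because y_α is linear in (−log_e x)^{1/3}, the same function gives the intercept and slope needed for a probit fit once p and q are fixed.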


3. NUMERICAL ACCURACY 


The largest source of error in (4) is generally the error in the normalizing approximation (3); for any
one q and p this varies in quite a complex way over the range of probabilities α from 0 to 1. For a fixed
q/p the error slowly decreases as q increases; when q only is fixed, it decreases with q/p because the χ²
term on the right of (1) has been replaced by q − ⅓. At any one probability level α, of course, the two
sources of error act sometimes in the same and sometimes in opposite directions. Table 1 shows exact
and calculated values of α for some small values of p and q. (The x values substituted in (4) were the
exact percentage points as tabulated by Thompson (1941) corresponding to α = 0.05, 0.25, 0.50, 0.75
and 0.95.) The values of p and q were the smaller ones chosen for tabulation by Kimball & Leach,
except for p = 2, q = 1. The errors are of the same order of magnitude as for their approximations, but
vary rather more regularly.


Table 1. Values of I_x(p, q) = α, calculated from equation (4); x values obtained
from Thompson's tables are such that the exact α are 0.05, 0.25, etc.


q →         2        2        2        4        6        10
p →         2        5        12       5        12       12
Exact α     Calculated values of α
0.05      0.0456   0.0486   0.0492   0.0464   0.0483   0.0464
0.25      0.2453   0.2482   0.2489   0.2460   0.2479   0.2454
0.50      0.5021   0.5020   0.4979   0.5003   0.5003   0.4993
0.75      0.7555   0.7532   0.7534   0.7531   0.7520   0.7528
0.95      0.9483   0.9477   0.9486   0.9503   0.9503   0.9514


The comparison with their approximation f is of most interest, as this is the one that can be used in the
dose-response problem. Table 2 shows this comparison for the same set of p and q, and with the values
of x used by them. The errors in (4) are more often the smaller, although not always.

For extreme values of α, nearer to 0 or 1, both approximations will generally have larger relative errors
than the ones tabulated, but these extremes are less important in dose-response analysis. At any rate
it seems clear that over at least 90% of the incomplete β distribution the cube-root log transformation
effectively normalizes it.

We have also given in Table 2 the errors for Kimball & Leach's approximation I*, although as they
have pointed out this approximation cannot be used for 'linearization'.
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Table 2. Comparison of (4) with Kimball & Leach’s approximations f and I*. 
I = exact value of I,( p,q); In = approximation (4)t 


¢ => 2 2 4 2 6 10 
p> 3 5 5 12 12 12 
x 0-14 0-42 0-29 0-68 0-48 0-37 
104 7 0533 0510 0505 0473 0517 0481 
104 (f—1) — 85 +15 —34 +15 +03 — 53 
10! (Iy—1) — 4 —15 — 36 —07 —18 — 35 
104 (I* —1) — 21 —10 —16 —07 —07 - ll 
x 0-36 0-64 0-47 0-82 0-61 0-49 
10¢ J 2955 3006 2999 2920 2923 2989 
104 (f—1) — 74 +31 —8l +40 +04 — 152 
10* (Iy—1) — 3% —10 — 35 —06 —18 — 41 
104 (I* — I) +121 +20 +36 — 02 +08 + 17 
x 0-64 0-82 0-65 0-92 0-73 0-60 
10° I 7045 7044 7064 7206 7011 6914 
104 (f —1) =" 22 +50 —42 +59 +15 —119 
104 (Iy—1) + 46 +35 +30 +32 +19 + 22 
10* (I* —T) + 59 +39 +30 +32 +16 + 16 
x 0-86 0-94 0-81 0-97 0-83 0-71 
10° J 9467 9541 9524 9436 9452 9452 
104 (f—1) — 35 —233 —19 —16 —02 — 33 
10* (Iy—1) — If — 25 +03 —321 +04 + 15 
104 (I* —I) — 30 +27 —43 —22 —05 — 07 


f I would like to thank Dr Kimball for sending full details of his results. 
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On certain functions of normal variates which are uncorrelated of a higher order 


By R. G. LAHA AND E. LUKACS*
The Catholic University of America, Washington, D.C.


It is well known that two linear forms L_1 = a_1x_1 + ... + a_nx_n and L_2 = b_1x_1 + ... + b_nx_n in independently
and normally distributed variates x_1, ..., x_n are uncorrelated if, and only if, the relation

\[ ab' = 0 \qquad (1a) \]

is satisfied. Moreover (1a) is equivalent to the stochastic independence of L_1 and L_2. Thus the condition
for the stochastic independence of two linear forms can be expressed either in terms of the coefficients
as equation (1a) or in terms of the product moment of the forms as

\[ E(L_1L_2) = E(L_1)\,E(L_2). \qquad (1b) \]

* This work was supported by the National Science Foundation.


A number of authors investigated the independence of two homogeneous forms Q_1 = xAx' and
Q_2 = xBx' in independently distributed normal variates with zero mean and unit variances. We mention
here only Craig (1943), Hotelling (1944) and Matusita (1949). These authors showed that Q_1 and Q_2 are
independent if, and only if,

\[ AB = 0. \qquad (2a) \]

A different condition, in terms of product moments, was given by Kawada (1950) who proved that the
relations

\[ E(Q_1^i Q_2^j) = E(Q_1^i)\,E(Q_2^j) \quad (i = 1, 2;\ j = 1, 2) \qquad (2b) \]

are equivalent to the condition (2a). Kawada's condition can be formulated more concisely by introducing
the following terminology.


Let x and y be two random variables and suppose that the moment E(x^r y^s) exists. We say that
x and y are uncorrelated of order (r, s) if the relations

\[ E(x^i y^j) = E(x^i)\,E(y^j) \quad (i = 1, \ldots, r;\ j = 1, \ldots, s) \]

hold. In our terminology (2b) means Q_1 and Q_2 are uncorrelated of order (2, 2).
In this note we prove the following theorem.


THEOREM. Let x_1, x_2, ..., x_n be n independent normal variates with zero mean and unit variance. If
the inhomogeneous quadratic form Q = xAx' + bx' and the linear form L = cx' are uncorrelated of
order (2, 2) then cA = 0 and cb' = 0.


Proof. We note first that E(x_i²) = 1, E(x_i⁴) = 3, E(x_i⁶) = 15. Using these values we obtain after some
straightforward but tedious computations the following expressions

\[ \begin{aligned}
E(LQ) - E(L)\,E(Q) &= cb', \\
E(L^2Q) - E(L^2)\,E(Q) &= 2cAc', \\
E(LQ^2) - E(L)\,E(Q^2) &= 4cAb' + 2cb'\,\mathrm{tr}\,A, \\
E(L^2Q^2) - E(L^2)\,E(Q^2) &= 8cAA'c' + 4cAc'\,\mathrm{tr}\,A + 2(cb')^2.
\end{aligned} \qquad (3) \]

We write here tr A = Σ_{i=1}^{n} a_{ii} for the trace of the matrix A. Since Q and L are uncorrelated of order (2, 2)
conditions (3) reduce to

\[ cA = 0, \qquad cb' = 0, \qquad (4) \]

so that the theorem is established.

It was shown by Laha (1956) that conditions (4) ensure the independence of L and Q. Therefore the
independence of L and Q follows from the assumption that L and Q are uncorrelated of order (2, 2).
The converse statement is obvious and we obtain the following corollary to our theorem.


COROLLARY. Let x_1, x_2, ..., x_n be n independent normal variables with zero mean and unit variance.
The inhomogeneous quadratic polynomial Q = xAx' + bx' and the linear form L = cx' are independent
if, and only if, they are uncorrelated of order (2, 2).

If the polynomial Q is homogeneous, that is if b = 0, then we can conclude from Kawada's result that
L and Q are independent if they are uncorrelated of order (4, 2). Our result shows that a weaker assumption
is sufficient for the independence of L and Q.
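
The four product-moment expressions (3) used in the proof can be verified exactly for any particular choice of coefficients. The sketch below is only a numerical check under stated assumptions: A is taken symmetric, polynomials in x_1, ..., x_n are stored as dictionaries mapping exponent tuples to coefficients, and expectations use the standard normal moments E(x^k) = (k − 1)!! for even k and 0 for odd k. The particular A, b and c are arbitrary illustrative values.

```python
from itertools import product

def normal_moment(k):
    """E[x^k] for a standard normal variate: 0 if k is odd, (k-1)!! if k is even."""
    if k % 2:
        return 0
    m = 1
    for j in range(1, k, 2):
        m *= j
    return m

def poly_mul(f, g):
    """Multiply two polynomials stored as {exponent tuple: coefficient}."""
    out = {}
    for (ef, cf), (eg, cg) in product(f.items(), g.items()):
        e = tuple(i + j for i, j in zip(ef, eg))
        out[e] = out.get(e, 0) + cf * cg
    return out

def expect(f):
    """Expectation of a polynomial in independent standard normal variates."""
    total = 0
    for exps, coef in f.items():
        term = coef
        for k in exps:
            term *= normal_moment(k)
        total += term
    return total

def linear_form(c):
    n = len(c)
    return {tuple(int(i == j) for i in range(n)): c[j] for j in range(n)}

def quadratic_poly(a, b):
    """Q = xAx' + bx' as a polynomial."""
    n = len(b)
    q = linear_form(b)
    for i in range(n):
        for j in range(n):
            e = [0] * n
            e[i] += 1
            e[j] += 1
            q[tuple(e)] = q.get(tuple(e), 0) + a[i][j]
    return q

def cov(f, g):
    return expect(poly_mul(f, g)) - expect(f) * expect(g)

def dot(p, q):
    return sum(x * y for x, y in zip(p, q))

# Arbitrary illustrative coefficients (A symmetric).
a = [[2, 1, 0], [1, -1, 3], [0, 3, 1]]
b = [1, -2, 1]
c = [1, 1, -1]

L, Q = linear_form(c), quadratic_poly(a, b)
L2, Q2 = poly_mul(L, L), poly_mul(Q, Q)
ca = [dot(c, col) for col in zip(*a)]        # the row vector cA
tr = sum(a[i][i] for i in range(len(a)))     # tr A

lhs = [cov(L, Q), cov(L2, Q), cov(L, Q2), cov(L2, Q2)]
rhs = [dot(c, b),
       2 * dot(ca, c),
       4 * dot(ca, b) + 2 * dot(c, b) * tr,
       8 * dot(ca, ca) + 4 * dot(ca, c) * tr + 2 * dot(c, b) ** 2]
print(lhs, rhs)
```

All arithmetic is in integers, so agreement of the two lists is exact rather than approximate.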

It would be desirable to obtain further results concerning the connexion between the order of un- 
correlatedness and the degree of two independent polynomial statistics in independent normal variates. 
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A further note on a simple method for fitting an exponential curve 


By H. D. PATTERSON
Rothamsted Experimental Station, Harpenden, Herts.


INTRODUCTION
The least-squares method for fitting the exponential regression curve

\[ E(y_x) = \alpha - \beta\rho^x, \quad \text{where } 0 < \rho < 1 \quad (x = 0, 1, 2, \ldots, n-1) \qquad (1) \]

requires the solution of a polynomial equation of degree 3n − 8 in r̂, the estimate of ρ. Stevens (1951)
described a method for obtaining r̂ and the least-squares estimates of α and β by successive approximation.
Patterson (1956) pointed out that the polynomial equation can be written in the form

\[ \hat r = \frac{\sum_{x=1}^{n-1} w_x(\hat r)\,y_x}{\sum_{x=1}^{n-1} w_x(\hat r)\,y_{x-1}} \quad (\Sigma w_x = 0), \qquad (2) \]

where the w_x(r̂) are polynomials of degree 3n − 9 and suggested that when full efficiency is not required
a useful approximation to r̂ can be obtained very simply from

\[ r' = \frac{\sum w_x y_x}{\sum w_x y_{x-1}} \quad (\Sigma w_x = 0), \qquad (3) \]

with the w_x given by the w_x(r̂) for some specified value of r̂; α and β can then be estimated by the
regression of y_x on r'^x. Suitable values of the w_x leading to estimates of moderate to high efficiency were
given for the cases n = 4, 5, 6 and 7.

This approximate method can be extended to larger values of n but, as pointed out by Patterson &
Lipton (1959), at least two sets of coefficients w_x are required to cover the entire range of ρ with reasonable
efficiency. In practice the use of alternative estimates is likely to be inconvenient.

The purpose of the present note is to present a modification of the method which leads to increased
efficiency and closer approximations to r̂ without adding unduly to the arithmetic.

MODIFIED METHOD FOR FOUR EQUALLY SPACED LEVELS OF x


The modification consists of replacing the w_x of (3) by functions u_x + r v_x chosen to give as good
approximations as possible to the polynomials w_x(r); the resulting estimate, which will be denoted simply
by r, is then one of the roots of the quadratic equation

\[ r^2\,\Sigma v_x y_{x-1} - r\,(\Sigma v_x y_x - \Sigma u_x y_{x-1}) - \Sigma u_x y_x = 0. \qquad (4) \]

When n = 4 the u_x and v_x can be chosen so that

\[ u_x + r_i v_x \propto w_x(r_i) \]

for three different values r_i = r_1, r_2, r_3. The estimate r is then equal to r̂ for each of the three values. When
the r_i are taken to be 0, 0.5 and 1 the values of u_x and v_x (with arbitrary factor) are


x     u_x     v_x

1     −10     −17

2       5      −5        (5)
3       5      22
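
The arithmetic of the modified method for n = 4 can be sketched as follows. This is an illustrative reading of equation (4) with the coefficients (5), not code from the paper; the function name is hypothetical, and taking the root that lies between 0 and 1 is an assumption made here for convenience. The yields are those of experiment 2 in Table 1.

```python
from math import sqrt

# Coefficients (5) for n = 4, indexed by x = 1, 2, 3.
U = (-10, 5, 5)
V = (-17, -5, 22)

def estimate_r(y, u=U, v=V):
    """Solve the quadratic (4) and return its roots lying in (0, 1)."""
    prev, cur = y[:-1], y[1:]                       # y_{x-1} and y_x for x = 1..n-1
    sum_v_prev = sum(vx * p for vx, p in zip(v, prev))
    sum_v_cur = sum(vx * c for vx, c in zip(v, cur))
    sum_u_prev = sum(ux * p for ux, p in zip(u, prev))
    sum_u_cur = sum(ux * c for ux, c in zip(u, cur))
    a = sum_v_prev
    b = -(sum_v_cur - sum_u_prev)
    c = -sum_u_cur
    disc = sqrt(b * b - 4 * a * c)                  # assumes real roots and a != 0
    roots = [(-b + sign * disc) / (2 * a) for sign in (1, -1)]
    return [r for r in roots if 0 < r < 1]

yields_exp2 = (13.5, 17.1, 18.7, 19.1)              # experiment 2 of Table 1
print(estimate_r(yields_exp2))                      # close to the least-squares 0.39259

exact_curve = tuple(40 - 10 * 0.5**x for x in range(4))   # alpha = 40, beta = 10, rho = 0.5
print(estimate_r(exact_curve))
```

When the data lie exactly on a curve of form (1), the quadratic returns ρ itself, because the u_x and the v_x each sum to zero.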


A measure of the degree of approximation involved is provided by the asymptotic efficiency of r
when the y_x are independent and have equal variance σ². Under these conditions the variance of r
tends to

\[ \frac{(1+\rho^2)\,\Sigma(u_x+\rho v_x)^2 - 2\rho\,\Sigma(u_x+\rho v_x)(u_{x-1}+\rho v_{x-1})}{\{\Sigma(u_x+\rho v_x)\,\rho^{x-1}\}^2}\;\frac{\sigma^2}{\beta^2} \qquad (6) \]

as σ²/β² tends to zero; the corresponding variance of r̂ is obtained by substituting w_x(ρ) for u_x + ρv_x
in (6) or, more conveniently, from F_rr σ², where F_rr is a quantity defined by Stevens (1951). The efficiency
of r is very high (over 99.9%) throughout the entire range of ρ. As 100% efficiency can, with this estimate,
only be obtained if r = r̂ the approximation to r̂ appears to be a good one.

As a further test of this point various inefficient estimates, including r, have been calculated from the
data of seven replicated field experiments on the responses of barley to four equally spaced levels of
nitrogen fertilizer and compared with the least-squares estimates r̂. The estimates considered, apart
from r and r̂, were

(1) the ratio

\[ r_{P,1.25} = \frac{4y_3 + y_2 - 5y_1}{4y_2 + y_1 - 5y_0} \]

proposed by Patterson (1956).

(2) r_R, the regression of y_x on y_{x−1}, where x = 1, 2, 3.

(3) r_H, the estimate given by the method of Hartley (1948).

(4) r_S, the estimate obtained after one cycle of the iterative method of Stevens (1951), using r_{P,1.25}
as the preliminary estimate.

The estimates r_S and r̂ were calculated on a high-speed computer. Estimates (1), (2) and (3) were
discussed in considerable detail for the case n = 4 by Finney (1958). These three estimates, and especially
r_H and r_R, have high asymptotic efficiencies when n = 4.

The experimental data and the least-squares estimates r̂ are set out in Table 1. Table 2 gives the
differences between each of the inefficient estimates and r̂. In general r provides a much closer approximation
to r̂ than r_{P,1.25}, r_R or r_H. It is also usually better than r_S, so that replacing r_{P,1.25} by r effectively
saves at least one cycle of the Stevens method.


Table 1. Experimental data and least-squares estimate of ρ


            Level of nitrogen
Exp.    x = 0     1       2       3         r̂
        Barley grain yields (cwt./acre)
1       32.2    38.6    38.6    39.6    0.09476
2       13.5    17.1    18.7    19.1    0.39259
3       29.7    34.3    36.7    37.2    0.42670
4       15.8    23.4    28.5    32.2    0.69191
5       29.0    35.0    37.6    37.7    0.33292
6       28.9    36.1    37.5    37.2    0.14816
7       26.6    32.1    34.2    34.6    0.33750


Table 2. Differences (multiplied by 10^5) between inefficient estimates and r̂ for data of Table 1


Exp.    r − r̂    r_{P,1.25} − r̂    r_R − r̂    r_H − r̂    r_S − r̂
1         69         3024          −1664       2169        533
2          2           85            215        −85         −2
3          2          275            191       −101        −19
4         −2         −191            −76         −1         −4
5          1         −124            190       −226         10
6        −18         −874            150       −465         51
7          0          −45            215       −116          2


Closer approximations than r are rarely likely to be required in practice, at any rate with data of the
type considered here. The largest difference between r and r̂ occurs with the data for which the fit
of the exponential curve is poorest (Exp. 1). The fact that these data have the smallest value of r̂ is
coincidental.


ESTIMATION OF ρ WITH MORE THAN FOUR EQUALLY SPACED LEVELS


Equation (4) can also be used to provide estimates of ρ for n > 4, but as the degree of the polynomials
w_x(r̂) increases with n the results cannot be expected to be as good as in the case n = 4. Nevertheless,
by a suitable choice of u_x and v_x a high standard of efficiency and close approximation to r̂ can be maintained,
at least up to n = 12.

When n > 4 the u_x + r_i v_x can be made exactly proportional to the w_x(r_i) for only two values of r_i.
This will determine quantities c_1 u_x and c_2 v_x; the ratio c_1/c_2 can then be chosen so as to minimize the
variance of the estimate for a third value of r_i.

Suitable values for n = 5 are as follows:

x      u_x      v_x
1     −4.11    −3.45
2      1.27    −5.54        (7)
3      1.74     2.50
4      1.10     6.49


With these values of u_x and v_x the asymptotic efficiency of r is more than 99.9% between about ρ = 0.12
and ρ = 0.75 but falls to 99.0% for ρ = 0 and 99.4% for ρ = 1.








For n = 6 to 12 the values of u_x and v_x were chosen so as to give almost full efficiency in the region of
ρ^{n−1} = 0.216 and in the region of ρ^{n−1} = 10^{−5}. These values are set out in Table 3. The asymptotic
efficiencies of the estimates are shown in Table 4 for the two cases n = 6 and n = 12; in the intermediate
cases the efficiencies are of comparable order. For purposes of comparison the corresponding efficiencies
of Hartley's estimate are also given. A feature of the Hartley estimate is its high efficiency relative to
other inefficient estimates which have been proposed (Patterson & Lipton, 1959). The present series of
estimates compare well with the Hartley estimate in terms of asymptotic efficiency.

Table 3. Values of u_x and v_x for n = 6 to 12

n = 6
u_x    −6.59    1.03    2.58    1.84    1.14
v_x    −6.29  −11.71   −1.96    8.87   11.09
n = 7
u_x    −7.88    0.20    2.89    2.44    1.45    0.90
v_x    −1.96  −10.94   −6.85    2.21    8.85    8.69
n = 8
u_x    −8.81   −0.88    2.85    3.05    1.99    1.09    0.71
v_x     1.17   −9.02   −8.92   −2.70    4.31    8.21    6.95
n = 9
u_x    −9.47   −2.07    2.49    3.47    2.68    1.54    0.80    0.56
v_x     3.45   −6.69   −9.33   −5.77    0.08    5.18    7.41    5.67
n = 10
u_x    −9.91   −3.24    1.86    3.62    3.30    2.19    1.14    0.58    0.46
v_x     4.53   −5.09   −9.33   −7.71   −3.00    2.21    6.04    7.22    5.13
n = 11
u_x   −10.27   −4.35    1.08    3.53    3.76    2.89    1.73    0.82    0.43    0.38
v_x     6.39   −2.24   −7.70   −8.12   −5.26   −1.08    2.86    5.42    5.84    3.90
n = 12
u_x    −9.81   −5.40    0.09    3.04    3.90    3.44    2.35    1.30    0.57    0.28    0.24
v_x     6.57   −0.21   −6.19   −7.88   −6.47   −3.29    0.38    3.40    5.16    5.16    3.37

Table 4. Efficiencies of (i) estimates of ρ obtained from equation (4) using the u_x and v_x
of Table 3 and (ii) by Hartley's method

                       n = 6              n = 12
−(n−1) ln ρ         (i)     (ii)       (i)     (ii)
0                   99.5   100.0       97.9   100.0
2                  100.0   100.0       99.9   100.0
4                   99.6    99.0       99.0    99.0
6                   99.4    97         99.2    96
8                   99.3    93         99.8    91
10                  99.2    90         99.2    84
12                  99.0    87         97      79
25                  97      81         77      59
∞ (i.e. ρ = 0)      97      81         69      51

If desired the values in Table 3 can be rounded off to one place of decimals provided that this is done
so that Σu_x and Σv_x remain zero. The rounding off usually results in some slight additional loss of
efficiency.

ESTIMATION OF ρ WITH FOUR LEVELS x = 0, 1, 2, 4


Most experiments on the responses of crops to fertilizers are designed with equally spaced levels of
the fertilizers. Occasionally, however, the levels are taken to be proportional to x = 0, 1, 2, 4. In this
case also a useful estimate of ρ can be obtained by solving a quadratic equation. A suitable equation is

\[ 4(y_2 - y_1)\,r^2 + (4y_2 + y_1 - 5y_0)\,r - (4y_4 + y_2 - 5y_1) = 0. \]

When r = 0.3 or 0.7 the estimate is almost identical with r̂. The asymptotic efficiencies of r are as follows:


ρ                0      0.1     0.2     0.3     0.4     0.5
Efficiency (%)   89.3   96.0    99.4   100.0    99.6    99.4
ρ                0.6    0.7     0.8     0.9     1.0
Efficiency (%)   99.7  100.0    99.9    99.1    97.8
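
The quadratic for the levels x = 0, 1, 2, 4 can be solved in the same way. The sketch below is illustrative only: the function name is hypothetical, y0, y1, y2 and y4 denote the yields at x = 0, 1, 2 and 4, and returning the root between 0 and 1 is an assumption of the sketch.

```python
from math import sqrt

def estimate_r_levels_0124(y0, y1, y2, y4):
    """Roots in (0, 1) of 4(y2 - y1) r^2 + (4 y2 + y1 - 5 y0) r - (4 y4 + y2 - 5 y1) = 0."""
    a = 4 * (y2 - y1)
    b = 4 * y2 + y1 - 5 * y0
    c = -(4 * y4 + y2 - 5 * y1)
    disc = sqrt(b * b - 4 * a * c)      # assumes real roots and y2 != y1
    roots = [(-b + sign * disc) / (2 * a) for sign in (1, -1)]
    return [r for r in roots if 0 < r < 1]

# Data lying exactly on y = alpha - beta * rho**x with alpha = 40, beta = 10, rho = 0.5.
alpha, beta, rho = 40.0, 10.0, 0.5
y = {x: alpha - beta * rho**x for x in (0, 1, 2, 4)}
print(estimate_r_levels_0124(y[0], y[1], y[2], y[4]))
```

For data lying exactly on the curve the equation is satisfied by r = ρ, as substitution of y_x = α − βρ^x shows.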
SUMMARY 


A method is presented for estimating ρ in the equation y = α − βρ^x for up to n = 12 equally spaced
values of x from the root of a quadratic equation in r with coefficients given by linear functions of the y_x.
Unless ρ is very small and n large these estimates have very high efficiencies and are good approximations
to the least-squares estimates r̂. The agreement with r̂ is particularly close when n = 4. Details
are also given of a similar method for the case in which x = 0, 1, 2 and 4.


I wish to thank Mr F.. V. Widdowson for providing the experimental data of Table 1. 


REFERENCES 
Finney, D. J. (1958). The efficiencies of alternative estimators for an asymptotic regression equation.
Biometrika, 45, 370-88.
Hartley, H. O. (1948). The estimation of non-linear parameters by 'internal least squares'.
Biometrika, 35, 32-45.
Patterson, H. D. (1956). A simple method for fitting an asymptotic regression curve. Biometrics,
12, 323-9.
Patterson, H. D. & Lipton, S. (1959). An investigation of Hartley's method for fitting an exponential
curve. Biometrika, 46, 281-92.
Stevens, W. L. (1951). Asymptotic regression. Biometrics, 7, 247-67.


Estimation of a parameter in the classical occupancy problem 


By COLIN R. BLYTH† and GREGORY L. CURME 
University of Illinois, Urbana 


† Work supported by the National Science Foundation. 


1. Introduction and summary. The following mathematical model has been used [see, for example, 
Rashevsky (1955), Sprott (1957), Thomas (1956), Curme & Harkness (1957)] for certain biological and 
combat situations. Each of the r members of an attacking force, independently of the others, attacks the 
ith member of a population of n individuals with probability 1/n (i = 1, 2, ..., n). Each attacker has the 
same probability p of killing the individual he attacks. Let X be the (random) number of population 
members surviving the attack. Writing $A_i$ for the event that the ith member of the population survives, 
we have 

$$P(A_1, \ldots, A_k) = \left(1 - \frac{k}{n}p\right)^r \qquad (k = 1, 2, \ldots, n).$$

Then, as shown by Feller (1950, p. 69), 

$$P(X = x) = \binom{n}{x} \sum_{v=0}^{n-x} (-1)^v \binom{n-x}{v} \left(1 - \frac{x+v}{n}p\right)^r \qquad (x = 0, 1, \ldots, n). \tag{1}$$

Note that when r < n and 0 < p < 1 this probability is zero for $x = 0, 1, \ldots, n-r-1$ and positive for 
$x = n-r, n-r+1, \ldots, n$. 

This note is concerned with the problem of estimating p from X. For the case $r \le n$ the uniformly 
minimum variance unbiased estimate of p is obtained and its variance computed. For the case r > n 
(in which no unbiased estimate of p exists) this same estimate has uniformly minimum variance among 
estimates having minimum bias in the neighbourhood of p = 0. 


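2. Factorial moments. The factorial moments of the random variable X with distribution (1) are easily 
written down. Writing $x^{(k)} = x(x-1)\cdots(x-k+1)$ and putting $a = x + v$, 

$$
\begin{aligned}
EX^{(k)} &= \sum_{x=k}^{n} x^{(k)} \binom{n}{x} \sum_{v=0}^{n-x} (-1)^v \binom{n-x}{v} \left(1 - \frac{x+v}{n}p\right)^r \\
&= n^{(k)} \sum_{x=k}^{n} \binom{n-k}{x-k} \sum_{v=0}^{n-x} (-1)^v \binom{n-x}{v} \left(1 - \frac{x+v}{n}p\right)^r \\
&= n^{(k)} \sum_{a=k}^{n} \sum_{x=k}^{a} (-1)^{a-x} \binom{n-k}{x-k} \binom{n-x}{a-x} \left(1 - \frac{a}{n}p\right)^r \\
&= n^{(k)} \sum_{a=k}^{n} \left(1 - \frac{a}{n}p\right)^r \sum_{x=k}^{a} (-1)^{a-x} \binom{n-k}{x-k} \binom{n-x}{a-x} \\
&= n^{(k)} \sum_{a=k}^{n} \left(1 - \frac{a}{n}p\right)^r \binom{n-k}{a-k} \sum_{x=k}^{a} (-1)^{a-x} \binom{a-k}{x-k} \\
&= n^{(k)} \left(1 - \frac{k}{n}p\right)^r \qquad (k = 1, 2, \ldots).
\end{aligned}
$$

That is, 

$$EX^{(k)} = n^{(k)}\left(1 - \frac{k}{n}p\right)^r \qquad (k = 1, 2, \ldots). \tag{2}$$
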


From (2) the moments of X can be written down using the usual formulae for moments in terms of 
factorial moments. Sprott (1957) finds the first two moments of X using indicator variables and this 
method is used by Barton & David (1959) to find the factorial moments. For the purposes of this note 
it is easiest to work with the factorial moments only. 
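
A small numerical check may help when reading (1) and (2) together. The sketch below uses exact rational arithmetic; the function names `falling` and `survivor_pmf` and the values n = 5, r = 3, p = 1/2 are illustrative assumptions, not part of the note. It evaluates the distribution (1) and confirms that its factorial moments agree with (2).

```python
from fractions import Fraction
from math import comb

def falling(x, k):
    """Falling factorial x^(k) = x (x-1) ... (x-k+1), with x^(0) = 1."""
    out = 1
    for i in range(k):
        out *= x - i
    return out

def survivor_pmf(n, r, p):
    """P(X = x) for x = 0..n, from the inclusion-exclusion form (1)."""
    p = Fraction(p)
    return [
        comb(n, x) * sum(
            (-1) ** v * comb(n - x, v) * (1 - Fraction(x + v, n) * p) ** r
            for v in range(n - x + 1)
        )
        for x in range(n + 1)
    ]

# Hypothetical small case: n = 5 individuals, r = 3 attackers, p = 1/2.
n, r, p = 5, 3, Fraction(1, 2)
pmf = survivor_pmf(n, r, p)

# Compare the k-th factorial moment of (1) with the closed form (2).
for k in range(1, n + 1):
    moment = sum(falling(x, k) * pmf[x] for x in range(n + 1))
    assert moment == falling(n, k) * (1 - Fraction(k, n) * p) ** r
```

Because r < n here, the entries of `pmf` for x below n - r are exactly zero, as noted after (1).
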


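3. Completeness of the family of distributions. The family of distributions (1) with 0 < p < 1 is 
complete; that is, $E_p f(X) = 0$ for all p implies $P\{f(X) = 0\} = 1$. 

The proof of this fact is as follows: 

$$Ef(X) = \sum_{x=0}^{n} f(x) \binom{n}{x} \sum_{v=0}^{n-x} (-1)^v \binom{n-x}{v} \left(1 - \frac{x+v}{n}p\right)^r, \tag{3}$$

which is a degree r polynomial in p. Thus $Ef(X) \equiv 0$ implies that every coefficient of this polynomial 
must be 0. The constant term is 

$$\sum_{x=0}^{n} f(x) \binom{n}{x} \sum_{v=0}^{n-x} (-1)^v \binom{n-x}{v} = f(n) = 0.$$

The coefficient of p is therefore 

$$\sum_{x=0}^{n-1} f(x) \binom{n}{x} \sum_{v=0}^{n-x} (-1)^v \binom{n-x}{v} \left\{-\frac{r}{n}(x+v)\right\} = 0,$$

i.e. 
$$\sum_{x=0}^{n-1} f(x) \binom{n}{x} \sum_{v=0}^{n-x} (-1)^v \binom{n-x}{v}\, v = 0,$$

i.e. 
$$\sum_{x=0}^{n-1} f(x) \binom{n}{x} (n-x) \sum_{v=1}^{n-x} (-1)^v \binom{n-x-1}{v-1} = 0,$$

i.e. $n f(n-1) = 0$. And in general, if $f(n) = f(n-1) = \ldots = f(n-k+1) = 0$, the coefficient of $p^k$ is 

$$\sum_{x=0}^{n-k} f(x) \binom{n}{x} \sum_{v=0}^{n-x} (-1)^v \binom{n-x}{v} \binom{r}{k} \left(-\frac{x+v}{n}\right)^k = 0,$$

which implies 

$$\sum_{x=0}^{n-k} f(x) \binom{n}{x} \sum_{v=0}^{n-x} (-1)^v \binom{n-x}{v} \left\{v^{(k)} + a_1(x) v^{(k-1)} + \ldots + a_k(x)\right\} = 0,$$

i.e. 
$$\sum_{x=0}^{n-k} f(x) \binom{n}{x} (n-x)^{(k)} \sum_{v=k}^{n-x} (-1)^v \binom{n-x-k}{v-k} = 0,$$

i.e. 
$$(-1)^k n^{(k)} f(n-k) = 0.$$

We can now conclude that $f(n) = f(n-1) = \ldots = f(n-r) = 0$, so that $P\{f(X) = 0\} = 1$ and the family 
of distributions of X is complete. 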


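4. Unbiased estimation of p. Because of completeness, the problem of unbiased estimation of any 
function of p can be solved by the method of Lehmann & Scheffé (1950). From (3) we observe that 

$$Ef(X) = \sum_{k=0}^{n} c_k \left(1 - \frac{k}{n}p\right)^r, \tag{4}$$

where $c_0, c_1, \ldots, c_n$ are constants. Therefore no functions of p other than such linear combinations of 
$\{1-(k/n)p\}^r$, k = 0, 1, ..., n, possess unbiased estimates. And from (2) we have 

$$E \sum_{k=0}^{n} c_k \frac{X^{(k)}}{n^{(k)}} = \sum_{k=0}^{n} c_k \left(1 - \frac{k}{n}p\right)^r. \tag{5}$$

The Lehmann & Scheffé (1950) conclusion now is that $\sum_{k=0}^{n} c_k X^{(k)}/n^{(k)}$ is the unique uniformly minimum 
variance unbiased estimate of $\sum_{k=0}^{n} c_k \{1-(k/n)p\}^r$, where $c_0, c_1, \ldots, c_n$ are arbitrary constants. To find the 
variance of this estimate we note that 

$$X^{(j)} X^{(k)} = \sum_{t=0}^{\min(j,k)} \frac{j^{(t)} k^{(t)}}{t!}\, X^{(k+j-t)},$$

which gives 

$$E\left[\left\{\frac{X^{(k)}}{n^{(k)}} - \left(1-\frac{k}{n}p\right)^r\right\}\left\{\frac{X^{(j)}}{n^{(j)}} - \left(1-\frac{j}{n}p\right)^r\right\}\right] = \sum_{t=0}^{\min(j,k)} \frac{j^{(t)}k^{(t)}}{t!}\,\frac{n^{(k+j-t)}}{n^{(j)}n^{(k)}}\left(1-\frac{k+j-t}{n}p\right)^r - \left(1-\frac{k}{n}p\right)^r\left(1-\frac{j}{n}p\right)^r.$$

The variance of $\sum_{k=0}^{n} c_k X^{(k)}/n^{(k)}$ is therefore 

$$E\left[\sum_{k=0}^{n} c_k\left\{\frac{X^{(k)}}{n^{(k)}} - \left(1-\frac{k}{n}p\right)^r\right\}\right]^2 = \sum_{j=0}^{n}\sum_{k=0}^{n} c_j c_k\left[\sum_{t=0}^{\min(j,k)} \frac{j^{(t)}k^{(t)}}{t!}\,\frac{n^{(k+j-t)}}{n^{(j)}n^{(k)}}\left(1-\frac{k+j-t}{n}p\right)^r - \left(1-\frac{k}{n}p\right)^r\left(1-\frac{j}{n}p\right)^r\right]. \tag{6}$$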


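In particular, we are interested in estimating p itself. For there to be an unbiased estimate of p we must 
be able to find constants $c_0, c_1, \ldots, c_n$ such that 

$$\sum_{k=0}^{n} c_k \left(1 - \frac{k}{n}p\right)^r = p. \tag{7}$$

Equating coefficients of powers of p, (7) becomes 

$$
\begin{aligned}
c_0 &= -(c_1 + \ldots + c_n); \\
\text{and}\quad c_1 + 2c_2 + \ldots + nc_n &= -n/r, \\
c_1 + 2^2 c_2 + \ldots + n^2 c_n &= 0, \\
&\;\;\vdots \\
c_1 + 2^r c_2 + \ldots + n^r c_n &= 0.
\end{aligned} \tag{8}
$$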
If r > n the equations (8) are not solvable, because the last n of them have the unique solution 
0, ..., 0 (their determinant of coefficients is of Van der Monde form and clearly non-zero), which does not 
satisfy the second equation. On the other hand, equations (8) have a unique solution if r = n and many 
solutions if r < n. 

When $r \le n$, one solution is to take $c_{r+1} = c_{r+2} = \ldots = c_n = 0$; equations (8) then reduce to the 
following uniquely solvable form: 

$$c_0 = -(c_1 + \ldots + c_r); \tag{9}$$

and 

$$
\begin{aligned}
c_1 + 2c_2 + \ldots + rc_r &= -n/r, \\
c_1 + 2^2 c_2 + \ldots + r^2 c_r &= 0, \\
&\;\;\vdots \\
c_1 + 2^r c_2 + \ldots + r^r c_r &= 0.
\end{aligned} \tag{10}
$$

The determinant of coefficients of equations (10) is of Van der Monde form, so can be readily evaluated: 

$$
\begin{vmatrix} 1 & 2 & \cdots & r \\ 1 & 2^2 & \cdots & r^2 \\ \vdots & \vdots & & \vdots \\ 1 & 2^r & \cdots & r^r \end{vmatrix}
= r!\begin{vmatrix} 1 & 1 & \cdots & 1 \\ 1 & 2 & \cdots & r \\ \vdots & \vdots & & \vdots \\ 1 & 2^{r-1} & \cdots & r^{r-1} \end{vmatrix}
= r!\,[1^{r-1} 2^{r-2} \cdots (r-1)] = [1^r 2^{r-1} 3^{r-2} \cdots r]. \tag{11}
$$

Using Cramér's rule, the solution for $c_k$ in equations (10) is the ratio of two determinants of which the 
denominator is (11) and the numerator is (11) with its kth column replaced by the column $-n/r, 0, \ldots, 0$. 
This numerator is easily reduced to Van der Monde form and its value found to be 

$$(-1)^k \frac{n}{rk}\binom{r}{k}\,[1^r 2^{r-1} 3^{r-2} \cdots r].$$

Equations (10) therefore have the unique solution 

$$c_k = (-1)^k \frac{n}{rk}\binom{r}{k} \qquad (k = 1, 2, \ldots, r). \tag{12}$$

The unique uniformly minimum variance unbiased estimate of p for the case $r \le n$ is therefore 

$$Z = \sum_{k=1}^{r} (-1)^k \frac{n}{rk}\binom{r}{k}\left\{\frac{X^{(k)}}{n^{(k)}} - 1\right\}. \tag{13}$$


Another solution of equations (8) is obtained by appending additional equations 

$$
\begin{aligned}
c_1 + 2^{r+1} c_2 + \ldots + n^{r+1} c_n &= 0, \\
&\;\;\vdots \\
c_1 + 2^n c_2 + \ldots + n^n c_n &= 0,
\end{aligned}
$$

and then solving the whole set in the same way that (9), (10) were solved. This shows (again for the case 
$r \le n$) that the uniformly minimum variance unbiased estimate for p is 

$$Z^* = \sum_{k=1}^{n} (-1)^k \frac{n}{rk}\binom{n}{k}\left\{\frac{X^{(k)}}{n^{(k)}} - 1\right\}. \tag{14}$$

Because of completeness, Z and Z* must coincide with probability 1; this can also be checked directly. 
We may use whichever form we find convenient. 
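
The agreement of the two forms can be checked numerically. The sketch below is illustrative only: the function names are assumptions, and the formulas coded are (13) and (14) as read here, with the sum running to r or to n respectively. The case n = 5, r = 4, X = 2 is the one quoted in the example of §6.

```python
from fractions import Fraction
from math import comb

def falling(x, k):
    """Falling factorial x^(k) = x (x-1) ... (x-k+1)."""
    out = 1
    for i in range(k):
        out *= x - i
    return out

def umvu_estimate(x, n, r, terms):
    """Sum over k = 1..terms of (-1)^k (n/(r k)) C(terms, k) {x^(k)/n^(k) - 1}.

    terms = r gives the form (13); terms = n gives the form (14).
    """
    return sum(
        (-1) ** k * Fraction(n, r * k) * comb(terms, k)
        * (Fraction(falling(x, k), falling(n, k)) - 1)
        for k in range(1, terms + 1)
    )

n, r, x = 5, 4, 2
z = umvu_estimate(x, n, r, terms=r)        # form (13)
z_star = umvu_estimate(x, n, r, terms=n)   # form (14)
```

Both forms give 47/48 here, which rounds to 0.98. The two agree only on the attainable values X = n - r, ..., n, which is all that coincidence with probability 1 requires.
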
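From (6) the variance of Z is 

$$
\begin{aligned}
&\sum_{j=1}^{r}\sum_{k=1}^{r} (-1)^{k+j}\,\frac{n^2}{r^2 jk}\binom{r}{j}\binom{r}{k}\left[\sum_{t=0}^{\min(j,k)} \frac{j^{(t)}k^{(t)}}{t!}\,\frac{n^{(k+j-t)}}{n^{(j)}n^{(k)}}\left(1-\frac{k+j-t}{n}p\right)^r - \left(1-\frac{k}{n}p\right)^r\left(1-\frac{j}{n}p\right)^r\right] \\
&\quad= \frac{n^2}{r^2}\sum_{j=1}^{r}\sum_{k=1}^{r} (-1)^{k+j}\binom{r}{j}\binom{r}{k}\left[\sum_{t=0}^{\min(j,k)} \frac{(j-1)^{(t-1)}(k-1)^{(t-1)}}{t!}\,\frac{n^{(k+j-t)}}{n^{(j)}n^{(k)}}\left(1-\frac{k+j-t}{n}p\right)^r - \frac{1}{jk}\left(1-\frac{k}{n}p\right)^r\left(1-\frac{j}{n}p\right)^r\right],
\end{aligned} \tag{15}
$$

where for t = 0 the factor $(j-1)^{(t-1)}(k-1)^{(t-1)}$ is to be read as $1/(jk)$. 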


For fixed r we have the following limits as n → ∞: (i) the variance (15) of Z tends to p(1−p)/r; (ii) the 
value of Z tends to (n − X)/r with probability 1; (iii) the distribution of n − X tends to binomial (r, p). 
All these limits are what we would expect since the probability is tending to 1 that r different individuals 
will be attacked. 

An unpleasant feature of the estimate Z is that it can exceed 1 for small values of X, being quite large 
for n = r large and values of X close to 0. This is only a slight drawback owing to the very small pro- 
babilities of such values of X even when p is close to 1. We would in practice modify Z by using instead 
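min (Z, 1). This modified estimate will be slightly biased and will have slightly smaller variance and 
slightly smaller expected squared error than Z. 


5. Estimation of p with locally minimum bias. In this section we will consider the case r > n, in which 
there is no unbiased estimate of p. An estimate of p with expectation 

$$g(p) = \sum_{k=0}^{n} c_k\left(1 - \frac{k}{n}p\right)^r$$

is said to have minimum bias in the neighbourhood of $p_0$ if g(p) and as many of its derivatives as possible 
agree at $p_0$ with the function p and its derivatives. The unique (and therefore uniformly minimum 
variance) estimate $\sum_{k=0}^{n} c_k X^{(k)}/n^{(k)}$ with this property is found by solving the equations 

$$g(p_0) = p_0,\quad g'(p_0) = 1,\quad g''(p_0) = 0,\ \ldots,\ g^{(n)}(p_0) = 0 \quad \text{for } c_0, c_1, \ldots, c_n.$$

These equations are 

$$
\begin{aligned}
c_0 + \left(1-\frac{1}{n}p_0\right)^{r} c_1 + \left(1-\frac{2}{n}p_0\right)^{r} c_2 + \ldots + \left(1-\frac{n}{n}p_0\right)^{r} c_n &= p_0; \\
\text{and}\quad \left(1-\frac{1}{n}p_0\right)^{r-1} c_1 + 2\left(1-\frac{2}{n}p_0\right)^{r-1} c_2 + \ldots + n\left(1-\frac{n}{n}p_0\right)^{r-1} c_n &= -n/r, \\
\left(1-\frac{1}{n}p_0\right)^{r-2} c_1 + 2^2\left(1-\frac{2}{n}p_0\right)^{r-2} c_2 + \ldots + n^2\left(1-\frac{n}{n}p_0\right)^{r-2} c_n &= 0, \\
&\;\;\vdots \\
\left(1-\frac{1}{n}p_0\right)^{r-n} c_1 + 2^n\left(1-\frac{2}{n}p_0\right)^{r-n} c_2 + \ldots + n^n\left(1-\frac{n}{n}p_0\right)^{r-n} c_n &= 0.
\end{aligned} \tag{16}
$$

The determinant of coefficients of equations (16) is of Van der Monde form, so can be readily evaluated. 
It equals 

$$
n!\left\{\left(1-\frac{1}{n}p_0\right)\cdots\left(1-\frac{n}{n}p_0\right)\right\}^{r-1}
\begin{vmatrix}
1 & 1 & \cdots & 1 \\
\dfrac{1}{1-p_0/n} & \dfrac{2}{1-2p_0/n} & \cdots & \dfrac{n}{1-np_0/n} \\
\vdots & \vdots & & \vdots \\
\left(\dfrac{1}{1-p_0/n}\right)^{n-1} & \left(\dfrac{2}{1-2p_0/n}\right)^{n-1} & \cdots & \left(\dfrac{n}{1-np_0/n}\right)^{n-1}
\end{vmatrix}
= \left\{\left(1-\frac{1}{n}p_0\right)\cdots\left(1-\frac{n}{n}p_0\right)\right\}^{r-n}[1^n 2^{n-1} \cdots n]. \tag{17}
$$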


Using Cramér's rule the solution for $c_k$ in equations (16) is the ratio of two determinants of which the 
denominator is (17) and the numerator is (17) with its kth column replaced by the column $-n/r, 0, \ldots, 0$. 
This numerator can be reduced to Van der Monde form and its value found to be 

$$(-1)^k \frac{n}{rk}\binom{n}{k}\left\{\left(1-\frac{1}{n}p_0\right)\cdots\left(1-\frac{n}{n}p_0\right)\right\}^{r-n}\left(1-\frac{k}{n}p_0\right)^{-(r-n)}[1^n 2^{n-1} \cdots n].$$

Equations (16) thus have the unique solution 

$$c_k = (-1)^k \frac{n}{rk}\binom{n}{k}\left(1-\frac{k}{n}p_0\right)^{-(r-n)} \qquad (k = 1, 2, \ldots, n). \tag{18}$$

The unique uniformly minimum variance estimate of p with minimum bias in the neighbourhood of 
$p_0$ is therefore 

$$Z^*(p_0) = p_0 + \sum_{k=1}^{n} (-1)^k \frac{n}{rk}\binom{n}{k}\,\frac{X^{(k)}/n^{(k)} - \{1-(k/n)p_0\}^r}{\{1-(k/n)p_0\}^{r-n}}. \tag{19}$$


Note that in the special case $p_0 = 0$ this reduces to Z* of §4. The special case $p_0 = 1$ must be handled 
separately: only derivatives up to order n−1 can be made to agree. The variance of $Z^*(p_0)$ can be 
written down from (6). 

In the case r > n we may then choose a number $p_0$ near which we think p lies, and then use the estimate 
$Z^*(p_0)$. Alternative low bias estimates would be to choose a g(p) which in some sense (such as maximum 
discrepancy) agrees most closely with the function p, and use the uniformly minimum variance unbiased 
estimate of that g(p). 
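
The matching conditions of this section can be verified in exact arithmetic. The sketch below is an illustration under stated assumptions: the function names are hypothetical, the coefficients are taken as $c_k = (-1)^k \{n/(rk)\}\binom{n}{k}\{1-(k/n)p_0\}^{-(r-n)}$ with $c_0$ fixed by $g(p_0) = p_0$, and only r > n with $0 \le p_0 < 1$ is covered (the case $p_0 = 1$ needs separate treatment).

```python
from fractions import Fraction
from math import comb

def min_bias_coefficients(n, r, p0):
    """Coefficients c_0..c_n of the locally minimum-bias estimate (sketch)."""
    p0 = Fraction(p0)
    if not (r > n and 0 <= p0 < 1):
        raise ValueError("sketch covers r > n and 0 <= p0 < 1 only")
    u = [1 - Fraction(k, n) * p0 for k in range(n + 1)]
    c = [Fraction(0)] * (n + 1)
    for k in range(1, n + 1):
        c[k] = (-1) ** k * Fraction(n, r * k) * comb(n, k) / u[k] ** (r - n)
    # c_0 is fixed by the requirement g(p0) = p0.
    c[0] = p0 - sum(c[k] * u[k] ** r for k in range(1, n + 1))
    return c

def derivative_of_g(c, n, r, p0, order):
    """order-th derivative at p0 of g(p) = sum_k c_k (1 - k p / n)^r."""
    p0 = Fraction(p0)
    falling_r = 1
    for i in range(order):
        falling_r *= r - i
    return sum(
        ck * falling_r * Fraction(-k, n) ** order
        * (1 - Fraction(k, n) * p0) ** (r - order)
        for k, ck in enumerate(c)
    )

# Hypothetical case: n = 4, r = 7, p0 = 1/3.
n, r, p0 = 4, 7, Fraction(1, 3)
coeffs = min_bias_coefficients(n, r, p0)
derivs = [derivative_of_g(coeffs, n, r, p0, j) for j in range(n + 1)]
```

Here `derivs` comes out as p0, 1, 0, ..., 0, so g agrees with the function p at p0 in value and in its first n derivatives.
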

As to other estimates of p: a minimax estimate for p can be found numerically for small values of n, r. 
The method we used (discrete a priori distributions) works for all n, r but is too laborious for all but the 
very smallest values. The value of the maximum likelihood estimate of p for a given value of X involves 
the location of the maximum value on the range 0 < p < 1 of a degree r—1 polynomial in p. For the 
trivial case r = 1 and for r = 2,3 this is easily done (the risk function, squared-error loss, is not par- 
ticularly good for these maximum likelihood estimates). But for r > 3 the maximum would have to be 
located numerically, which is too laborious. 


6. Example. A flight of n missiles is attacked by r interceptors, each of which is equipped with infra- 
red guidance. The interceptors are guided to the target area by radar and then each seeks a target. We 
assume that each interceptor chooses a target at random from among the n missiles, has probability p 
of destroying it and probability 1 — p of missing it and passing out of the target area. From the number 
X of missiles surviving this attack, we wish to estimate p, which measures the effectiveness of the 
interceptors. 

If n = 5, r = 4, formula (13) gives the uniformly minimum variance unbiased estimate for p; when 
X = 2 this estimate has the value 0-98. 

If n = 4, r = 5, no unbiased estimate of p is possible. Supposing values of p close to 1 to be of most 
interest, we may decide to use formula (19) with $p_0 = 1$, which gives the uniformly minimum variance 
estimate with minimum bias in the neighbourhood of p = 1; when X = 1 this estimate has the value 0-80. 


The results of § 2 were obtained by one of the authors working jointly with W. Harkness. 


Geometry and linear discrimination 


By C. W. CLUNIES-ROSS and R. H. RIFFENBURGH 
Virginia Polytechnic Institute and University of Hawaii 


1. OUTLINE OF THE PROBLEM 


The problem of discrimination is a problem of classification, viz. on the basis of a set of measurements 
on an individual that individual is to be classified into one of a set of populations. Although many rami- 
fications are possible this paper will be concerned only with the situation where there are just two 
possible populations and both of these populations are multivariate normal with respect to the set of 
measurements characterizing a random member of the population. 

The criterion used in defining the best discriminator depends on the situation; however, for a wide 
class of situations the criterion used is the maximum of the, possibly weighted, conditional probabilities 
of misclassification. The weights would be either prior probabilities or loss functions, but not both. 
Another criterion applicable to certain situations is the sum of the, possibly weighted, conditional 
probabilities of misclassification. The weights here would be the loss functions if relevant loss functions 
exist and are known. Discrimination based on these two criteria will be referred to as ‘minimax’ and 
‘minisum’ discrimination, respectively. This paper will be primarily concerned with minimax 
discrimination. 

For both types of discrimination, and indeed for any reasonable type of discrimination, the best 
discriminator is one of the surfaces of constant likelihood ratio. When the two populations are both 
multivariate normal these surfaces correspond to a family of similar and similarly situated hyperconics 
in the spaces of measurements. Except for special cases the identification of the best member of the 
family (for minimax discrimination) and its evaluation (for minimax or minisum discrimination) are 
mathematically intractable. 

This paper will investigate the best discriminator subject to the restriction that it is linear in the 
measurements, i.e. is a hyperplane in the measurement space. Much the same problem was the topic of 
another paper by the authors (Riffenburgh & Clunies-Ross, 1960). That paper considered the optimum 
constant associated with such a hyperplane, given an arbitrary vector of direction numbers. The current 
paper considers the determination of such a vector. 

Throughout the paper the following notation will be used: $N(M, \Sigma)$ means either a multivariate 
normal population with means M and dispersion matrix $\Sigma$ or the corresponding density function. The 
weights will be $W_1$, $W_2$, with $W_1 \ge W_2$. The two populations are then 

$$N(M_1, \Sigma_1), \qquad N(M_2, \Sigma_2). \tag{1-1}$$

If $\Sigma_1$, $\Sigma_2$ are non-singular then there is always a non-singular linear transformation of the measurement 
space to make the two populations 

$$N(O, I), \qquad N(M, D^{-1}), \tag{1-2}$$

where D is diagonal with diagonal elements ($d_i$) and $0 < d_1 \le d_2 \le \ldots \le d_n$. (There is no real loss of 
generality in assuming $\Sigma_1$, $\Sigma_2$ to be non-singular.) 


2. GEOMETRY AND SOME TERMINOLOGY 
The family of hyperconics 

$$(X-M)'A(X-M) = \theta \tag{2-1}$$

represents the constant likelihood contours of the $N(M, A^{-1})$ distribution. The member of (2-1) through 
a specific point Y is 

$$(X-M)'A(X-M) = (Y-M)'A(Y-M). \tag{2-2}$$

The tangent (i.e. tangential hyperplane) at Y to (2-2) is 

$$(X-M)'A(Y-M) = (Y-M)'A(Y-M),$$

i.e. 
$$(X-Y)'A(Y-M) = 0. \tag{2-3}$$

This tangent will be referred to as the tangent at Y to the family (2-1). This is permissible: it always 
exists, and it is unique except for the case Y = M, in which case any hyperplane $(X-M)'B = 0$ 
may be regarded as a tangent at M to the family. 


For certain points Y(t) the tangents at Y to each of two families of the type (2-1) are identical. Using 
the notation (and the corresponding assumptions) of (1-1) and (1-2) the families are 

$$X'X = \theta, \qquad (X-M)'D(X-M) = \phi. \tag{2-4}$$

The tangents at Y to (2-4) will be 

$$(X-Y)'Y = 0, \tag{2-5}$$

$$(X-Y)'D(Y-M) = 0. \tag{2-6}$$

These will be identical if and only if there exists some constant t such that 

$$tY = D(Y-M), \tag{2-7}$$

which determines a locus for Y, viz. 

$$Y(t) = (D - tI)^{-1}DM. \tag{2-8}$$

This locus will play an important role in what follows and will be referred to as the fully common tangent 
locus although, strictly, it is the locus of the points of contact of fully common tangents. The tangent 
corresponding to a particular value of t (or a particular point Y on the fully common tangent locus) 
will be referred to as the fully common tangent at t (or at Y). 
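
Since D is diagonal, a point of the locus has components $d_i m_i/(d_i - t)$. The short sketch below (the function name `locus_point` and the numerical values are illustrative assumptions) computes such a point and checks the defining relation tY = D(Y − M) exactly.

```python
from fractions import Fraction

def locus_point(m, d, t):
    """Y(t) = (D - tI)^{-1} D M for diagonal D = diag(d); requires t != d_i."""
    return [di * mi / (di - t) for mi, di in zip(m, d)]

# Hypothetical two-dimensional case.
m = [Fraction(2), Fraction(1)]
d = [Fraction(1), Fraction(3)]
t = Fraction(-2)

y = locus_point(m, d, t)
lhs = [t * yi for yi in y]                                 # t Y
rhs = [di * (yi - mi) for di, yi, mi in zip(d, y, m)]      # D (Y - M)
assert lhs == rhs
```

Setting t = 0 returns M itself, and large |t| drives the point towards the origin.
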

The fully common tangent locus (2-8) consists of a set of branches; each branch superficially resembles 
a rectangular hyperbola except that it is not coplanar. On examining the positions of the locus with 
respect to t the following facts are demonstrable: 

t = 0         Y(t) = M, 
t = ±∞        Y(t) = O, 
t = $d_i$       Y(t) is infinite. 

The locus is continuous in t except at t = $d_i$ and has as many branches and asymptotes as there are 
distinct $d_i$ associated with non-zero $m_i$. 

The ranges $(d_1, d_2), \ldots, (d_{n-1}, d_n)$ of t are each associated with one branch of the locus. The split range 
$[(d_n, \infty)(-\infty, d_1)]$ of t corresponds to the remaining branch of the locus which passes through O, M. 
Further results pertaining to the fully common tangent locus are embodied in the following Lemmata. 


Lemma 1. The necessary and sufficient condition for a fully common tangent at t to pass between 
O, M is that t be negative. 

Proof. The N and S condition may be expressed as: the left side of (2-5) for M, i.e. $(M-Y)'Y$, and 
$-Y'Y$ be of opposite sign; i.e. that $(M-Y)'Y$ be positive. Substituting Y(t) from (2-8) the condition 
becomes that $M'(I-(D-tI)^{-1}D)(D-tI)^{-1}DM$ be positive. The matrices are diagonal, so that the 
condition may be written as: $-tM'(D-tI)^{-1}D(D-tI)^{-1}M$ be positive; i.e. that t be negative. (Note that 
Y(0) = M; Y(+∞) = Y(−∞) = O.) 

Lemma 2. Any hyperplane passing between O, M intersects the fully common tangent locus at, at 
least, one negative t. 

Proof. A general hyperplane may be expressed as $X'B + C = 0$. An N and S condition for it to pass 
between O, M is that C and $M'B + C$ be of opposite sign. Substituting (2-8), the intersections of the 
hyperplane with the locus are given by the roots of 

$$\sum_i \frac{b_i m_i d_i}{d_i - t} + C = 0. \tag{2-9}$$

The left-hand side of (2-9) is equal to C when t = −∞, and equal to $\sum_i b_i m_i + C = M'B + C$ when t = 0. 
Further it is continuous in t for t negative. Therefore (2-9) has at least one negative root. 


Lemma 3. The fully common tangent at $t_0$ ($t_0 < 0$) does not intersect the fully common tangent locus 
at any other negative t. 

Proof. Intersections for t (as in (2-7)) where $t \ne t_0$ are given by the roots of 

$$\sum_i \left[\frac{m_i^2 d_i^2 (t - t_0)}{(d_i - t_0)^2 (d_i - t)}\right] = 0. \tag{2-10}$$

All terms in the summation of (2-10) have the same sign for any negative t. 


3. TANGENTS AND PROBABILITIES 


Any hyperplane $(X-X_0)'B = 0$ divides the measurement space into two regions. If there is an 
$N(M, A^{-1})$ distribution over the observations, then the probability that an observation will lie in each 
of the two regions is given by 

$$\int_{(X-X_0)'B>0} N(M, A^{-1})\,dX; \qquad \int_{(X-X_0)'B<0} N(M, A^{-1})\,dX. \tag{3-1}$$

These may be expressed as 

$$\int_{-\infty}^{a} N(0,1)\,dx; \qquad \int_{a}^{\infty} N(0,1)\,dx, \tag{3-2}$$

where 

$$a = (M-X_0)'B/(B'A^{-1}B)^{1/2}. \tag{3-3}$$


Lemma 4. The hyperplane through Y maximizing the probability that an observation from an 
$N(M, A^{-1})$ distribution lies on the same side of the hyperplane as M is (2-3), the tangent at Y to the 
family of constant likelihood surfaces. 

Proof. The probability in question is 

$$\max\left[\int_{-\infty}^{a} N(0,1)\,dx,\ \int_{a}^{\infty} N(0,1)\,dx\right],$$

where a is defined by (3-3) with Y substituted for $X_0$. Thus the probability is maximized when $\ln(a^2)$ 
is maximized with respect to B. This is simply obtained by finding 

$$\frac{\partial \ln(a^2)}{\partial B} = \frac{2(M-Y)}{m} - \frac{2A^{-1}B}{\sigma^2}, \tag{3-4}$$

where m, σ are the numerator and denominator of a. On equating the partial derivative vector (3-4) 
to O we find 

$$B = (\sigma^2/m)\,A(M-Y). \tag{3-5}$$

As B need only be known to within a scale factor the term $(\sigma^2/m)$ may be omitted from (3-5). This result 
does in fact represent a maximum; the minimum, a = 0, is obscured by the logarithm and corresponds 
to a hyperplane through M. The hyperplane required is therefore $(X-Y)'A(M-Y) = 0$, which is 
equivalent to (2-3). 


Lemma 5. The hyperplane which simultaneously 

(i) effects a fixed division of probabilities with respect to one multivariate normal distribution; 

(ii) maximizes the division of probabilities with respect to another multivariate normal distribution; 
is a fully common tangent to the two families of constant likelihood contours. 

Proof. The transformed form (1-2) may be used (with corresponding assumptions) with the 
distribution of (i) being N(O, I). 

The class of hyperplanes satisfying (i) are the tangents to 

$$X'X = C, \tag{3-6}$$

where C is a constant whose value depends on the division specified in (i). This follows from the fact 
that every hyperplane is tangential to one of the family of constant likelihood contours, i.e. can be 
expressed as 

$$(X-X_0)'X_0 = 0 \tag{3-7}$$

for some $X_0$. This is equivalent to saying that $X_0$ can be chosen in (3-7) such that this tangent plane 
divides N(O, I) into the preassigned probabilities. 

The tangent (3-7) divides the space into two regions; the probability that an observation from the 
$N(M, D^{-1})$ distribution will fall on the same side of the tangent as M is 

$$\max\left[\int_{-\infty}^{b} N(0,1)\,dx,\ \int_{b}^{\infty} N(0,1)\,dx\right], \tag{3-8}$$

where 

$$b = (M-X_0)'X_0/(X_0'D^{-1}X_0)^{1/2}. \tag{3-9}$$

Thus (3-8) is maximized when $\ln(b^2)$ is maximized. To fulfil the conditions of the Lemma this maximum 
is restricted by the fact that $X_0$ must satisfy (3-6). By the use of Lagrange multipliers and the regularity 
of $\ln(b^2)$, maxima, minima, and some points of inflexion of $\ln(b^2)$ occur when 

$$\frac{\partial \ln(b^2)}{\partial X_0} + \lambda\,\frac{\partial (X_0'X_0 - C)}{\partial X_0} = 0 \tag{3-10}$$

and 

$$X_0'X_0 - C = 0. \tag{3-11}$$

(3-10) may be expanded as 

$$\frac{2(M-2X_0)}{(M-X_0)'X_0} - \frac{2D^{-1}X_0}{X_0'D^{-1}X_0} + 2\lambda X_0 = 0. \tag{3-10a}$$

If $X_0 = Y(t)$ subject to (3-11), where Y(t) is defined in (2-8), then 

$$(M-X_0)'X_0 = -tM'(D-tI)^{-1}D(D-tI)^{-1}M = -tX_0'D^{-1}X_0$$

and 

$$M - 2X_0 + tD^{-1}X_0 = (I - 2(D-tI)^{-1}D + tD^{-1}(D-tI)^{-1}D)M = -(D-tI)^{-1}DM = -X_0;$$

thus these values of $X_0$, with $\lambda = -\{tM'(D-tI)^{-1}D(D-tI)^{-1}M\}^{-1}$, 
yield solutions of (3-10); i.e. points of the fully common tangent locus satisfy (3-10). To prove that the 
maximum is necessarily on this locus, (3-10a) is premultiplied by $X_0'$ and (3-11) is used to obtain 

$$\lambda = \frac{1}{(M-X_0)'X_0}. \tag{3-12}$$

Using (3-12) in (3-10) and simplifying leads to 

$$\frac{M-X_0}{(M-X_0)'X_0} = \frac{D^{-1}X_0}{X_0'D^{-1}X_0}. \tag{3-13}$$

Redefining: 

$$M - X_0 = -tD^{-1}X_0, \tag{3-14}$$

where t is the ratio of scalar denominators $t = -(M-X_0)'X_0/X_0'D^{-1}X_0$, which leads to the known 
relationship (2-8). 

As in Lemma 4 there are some minima obscured by the logarithm, which are those tangents to (3-6) 
passing through M; none of these tangents are real when $X_0'X_0 > M'M$. 

Apart from these last tangents we have, in the situation where the $m_i$ are non-zero and the $d_i$ are distinct, 
found as many roots as there are dimensions. It has not been verified that there are no other roots. 


4. SOLUTION FOR CERTAIN CASES 


The mathematical description of the cases covered in terms of the standardized problem and of the 
original parameters will be given at the end of this section. The geometric definition is that the hyper- 
plane corresponding to the optimum discriminant passes between the centres of the populations, i.e. 
that the optimum discriminator discriminates between the means of the two populations. 

The hyperplane corresponding to the discriminator will intersect the locus of fully common tangents 
at at least one Y(t), where t is negative (Lemma 2). The hyperplane is the fully common tangent at that 
point because, of the hyperplanes through Y(t), it simultaneously minimizes both of the conditional 
probabilities of misclassification (Lemma 4). Lemma 3 establishes that no paradox will occur. Although 
this fully common tangent may intersect the locus of fully common tangents at Y(t) for some positive t, 
the fully common tangent corresponding to this positive value of t will not pass between the centres, 
and so one of the conditional probabilities of misclassification would be greater than ½. For the fully 
common tangent corresponding to the negative t both conditional probabilities of misclassification are 
less than ½ (Lemma 1). 

The determination of the value of t can be achieved only numerically. It should be noted that 

(i) $Y(t)'Y(t) = \sum_i m_i^2 d_i^2/(d_i-t)^2$ is monotonically increasing as t increases from −∞ to 0; 

(ii) $(Y(t)-M)'D(Y(t)-M) = t^2 \sum_i m_i^2 d_i/(d_i-t)^2$ is monotonically decreasing as t increases from −∞ 
to 0. 

The two weighted conditional probabilities of misclassification are therefore oppositely monotonic 
in t for t negative. As they are both continuous functions of t the minimax will be achieved when they are 
equal. Thus numerically the problem becomes that of finding the unique negative root of 

$$W_1 \int_{a(t)}^{\infty} N(0,1)\,dx = W_2 \int_{b(t)}^{\infty} N(0,1)\,dx, \tag{4-1}$$

where 

$$a(t) = \left[\sum_i m_i^2 d_i^2/(d_i-t)^2\right]^{1/2}, \tag{4-1a}$$

$$b(t) = -t\left[\sum_i m_i^2 d_i/(d_i-t)^2\right]^{1/2}. \tag{4-1b}$$

The condition for the optimum discriminant hyperplane to pass between the centres is the condition 
for (4-1) to have a negative root. If there is a negative root then this indicates a restricted minimum 
which, by an argument analogous to that which eliminated the possibility of paradoxes, is an 
unrestricted minimum. 
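
Since the two sides of the equation for t are oppositely monotonic for negative t, the root can be located by bisection. The sketch below is one possible numerical treatment, not the authors' procedure: the function names are assumptions, a(t) is taken as $[\sum m_i^2 d_i^2/(d_i-t)^2]^{1/2}$ and b(t) as $-t[\sum m_i^2 d_i/(d_i-t)^2]^{1/2}$, and the normal tail is obtained from the standard-library `math.erfc`.

```python
import math

def normal_upper_tail(z):
    """Upper tail probability of the standard normal distribution."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))

def minimax_root(m, d, w1=1.0, w2=1.0, iterations=200):
    """Negative root t of w1 * tail(a(t)) = w2 * tail(b(t)), by bisection."""
    def a(t):
        return math.sqrt(sum((mi * di / (di - t)) ** 2 for mi, di in zip(m, d)))

    def b(t):
        return -t * math.sqrt(sum(mi * mi * di / (di - t) ** 2 for mi, di in zip(m, d)))

    def h(t):
        return w1 * normal_upper_tail(a(t)) - w2 * normal_upper_tail(b(t))

    hi = 0.0
    if h(hi) >= 0.0:
        raise ValueError("no negative root: the difference is not negative at t = 0")
    lo = -1.0
    while h(lo) <= 0.0:          # expand the bracket until the sign changes
        lo *= 2.0
        if lo < -1e12:
            raise ValueError("no negative root found")
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if h(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Hypothetical check: equal weights, D = I, M = (2, 0) gives t = -1 by symmetry.
t_root = minimax_root([2.0, 0.0], [1.0, 1.0])
```

With D = I and equal weights the two populations differ only in location, so the minimax hyperplane bisects the line joining the centres, which corresponds to t = −1.
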

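The condition that there be a negative root is that 

$$\left\{\tfrac{1}{2}W_1 - W_2\int_{b(-\infty)}^{\infty} N(0,1)\,dx\right\}\left\{\tfrac{1}{2}W_2 - W_1\int_{a(0)}^{\infty} N(0,1)\,dx\right\} > 0, \tag{4-2}$$

where 

$$b(-\infty) = \left[\sum_i m_i^2 d_i\right]^{1/2}, \tag{4-2a}$$

$$a(0) = \left[\sum_i m_i^2\right]^{1/2}. \tag{4-2b}$$

In the original parametrization of $N(M_1, \Sigma_1)$, $N(M_2, \Sigma_2)$, 

$$b(-\infty) = [(M_1-M_2)'\Sigma_2^{-1}(M_1-M_2)]^{1/2}, \tag{4-3}$$

$$a(0) = [(M_1-M_2)'\Sigma_1^{-1}(M_1-M_2)]^{1/2}. \tag{4-3a}$$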


5. OTHER CASES 


When the condition (4-2) is not satisfied the problem becomes more complicated. The best 
discriminating hyperplane will not pass between the centres. Minimax discrimination will ‘favour’ the population 
with the greater weight factor. (If the weight factors are equal then (4-2) is satisfied except when M = O.) 
From Lemma 5 it follows that the best discriminating hyperplane is still a fully common tangent. The 
minimization of $W_1P_1 + W_2P_2$ also leads to the family of fully common tangents. Thus both criteria, and indeed 
any reasonable criterion for linear discrimination, lead, for multivariate normal populations, to 
picking one of the family of fully common tangents. The determination of the appropriate one can be 
achieved only numerically. 


Approximation to the distribution of the sample size for sequential tests. 
II. Tests of composite hypotheses 


By D. H. BHATE 
University College London and Delhi School of Economics, Delhi 


1-1. Introduction. The original method of construction of sequential probability ratio tests given 
by Wald (1947) applied to discrimination between two simple hypotheses. Wald extended his procedure 
to tests discriminating between composite hypotheses by introducing a rather arbitrary weighting 
function, by means of which the problem was reduced to choice between simple hypotheses. Armitage 
(1947), Barnard (1952) and Cox (1952) gave, in different ways, direct probability justification for using 
appropriate likelihood ratios for comparisons of certain composite hypotheses. Nandi (1946) and 
Bahadur (1954) have also discussed this problem. Rushton (1950, 1952a, b) studied sequential versions 
of the t and F tests. Johnson (1953, 1954) considered sequential tests in general analysis of variance 
problems. Ray (1956) gives a detailed discussion of the application of these tests in one-way classifica- 
tions by groups and in randomized blocks. 

In general, we do not know much about the distribution of sample size, or even indeed the average 
sample size (a.s.n.) for these sequential tests. In this paper the methods used by Bhate (1959) in studying 
the distribution of the decisive sample size (d.s.n.) for sequential tests of simple hypotheses will be 
extended to tests of composite hypotheses. 


2-1. Let us suppose that $\Pi_1$ and $\Pi_2$ are two Normal populations with means $\mu_1$, $\mu_2$ and variances 
$\sigma_1^2$, $\sigma_2^2$, respectively, with $\lambda = \sigma_1^2/\sigma_2^2$. We wish to discriminate between the (composite) hypotheses 
$H_0$: $\lambda = \lambda_0$ and $H_1$: $\lambda = \lambda_1$. If we wish to control (approximately) the chances of error by making 

$$\Pr\{\text{accept } H_1 \mid H_0\} = \alpha, \qquad \Pr\{\text{accept } H_0 \mid H_1\} = \beta,$$

an appropriate sequential test is to measure one individual from each of the two populations at each 
stage, giving measurements $x_{1m}$, $x_{2m}$ from $\Pi_1$, $\Pi_2$, respectively, at the mth stage, calculate 

$$F_m = \sum_{i=1}^{m}(x_{1i}-\bar{x}_{1m})^2 \Big/ \sum_{i=1}^{m}(x_{2i}-\bar{x}_{2m})^2, \quad \text{where} \quad \bar{x}_{tm} = \frac{1}{m}\sum_{j=1}^{m} x_{tj} \quad (t = 1, 2),$$

and (i) continue to the (m+1)th stage if $F_m^- < F_m < F_m^+$, 

(ii) accept $H_1$ if $F_m \ge F_m^+$, 

(iii) accept $H_0$ if $F_m \le F_m^-$, 
where $F_m^-$, $F_m^+$ are, respectively, solutions of the equations 

$$\frac{\beta}{1-\alpha},\ \frac{1-\beta}{\alpha} = \left(\frac{\lambda_0}{\lambda_1}\right)^{(m-1)/2}\left(\frac{1+F_m/\lambda_0}{1+F_m/\lambda_1}\right)^{m-1}.$$

Upper and lower bounds for the cumulative distribution of the d.s.n. (Bhate, 1959, equation (2-2-4)) 
are given by the inequalities 

$$P_m \le \sum_{i=1}^{m} p_i \le P_m + \sum_{i=1}^{m-1} p_{iA}\,\eta_{im} + \sum_{i=1}^{m-1} p_{iR}\,\xi_{im}, \tag{2-1-1}$$

where (a) $p_i$ = probability that a decision is reached at (but not before) the ith stage 
(i.e. $\sum_{i=1}^{m} p_i$ = probability that the d.s.n. is less than or equal to m); 

(b) $P_m = \int_0^{F_m^-} p(F_m \mid \lambda)\,dF_m + \int_{F_m^+}^{\infty} p(F_m \mid \lambda)\,dF_m$; 

(c) $p_{iA}$ ($p_{iR}$) = probability of deciding to accept $H_0$ ($H_1$) at the ith stage; 

and (d) $\eta_{im}$ ($\xi_{im}$) = the largest possible value for the conditional probability that, a decision having 
been reached at the ith stage, (m−i) further observations would produce a 
value of $F_m$ such that $F_m^- < F_m < F_m^+$. 

The quantities $F_m^-$ and $F_m^+$ are fixed numbers and can be calculated directly. The lower bound of the 
cumulative distribution of the d.s.n., namely 

$$P_m = \int_0^{F_m^-} p(F_m \mid \lambda)\,dF_m + \int_{F_m^+}^{\infty} p(F_m \mid \lambda)\,dF_m,$$


is a function of the true value of A = oj/03, and is easily calculated. 
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The calculation of the upper bound is more complicated. It calls for splitting up p; into p,4 and p;Rr, 
and evaluating 9; and 84m. 
Under certain conditions of symmetry, and supposing a = f, we have p,4/p;r constant with respect 


to i, so that when H, is true 
PiaA=(1—-a)p;, Pir = ap; 
and when H, is true 


PiA=Ap, Pin = (1—a)p;. 
Fr 
Calculation of im: Nim = max i) PAF m| Fi) dF. 
Fi<F; JF, 


It is evident that the maximum value of the integral is attained when fF; = F;. Hence we must 
evaluate 


Ft 
tn = i 9p POE m | FF) EP 


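Now $F_m = \lambda\chi^2_{1,m-1}/\chi^2_{2,m-1}$, where the numerator and denominator contain independent χ²'s, each with m-1 degrees of freedom. With an obvious notation, we can write

$$F_m = \lambda\,\frac{\chi^2_{1,i-1} + \chi^2_{1,m-i}}{\chi^2_{2,i-1} + \chi^2_{2,m-i}},$$

all the χ²'s being mutually independent. Given that $\lambda\chi^2_{1,i-1}/\chi^2_{2,i-1} = F_i^-$, it is straightforward to show that the conditional distribution of $\chi^2_{2,i-1}$ is that of $(1+F_i^-/\lambda)^{-1}$ times a χ²-variable with 2(i-1) degrees of freedom. We will denote this by $(1+F_i^-/\lambda)^{-1}\chi^2_{2(i-1)}$. This quantity is of course independent of $\chi^2_{1,m-i}$ and $\chi^2_{2,m-i}$. Hence

$$\Pr\{F_m < F_m^+ \mid F_i = F_i^-\} = \Pr\left\{\frac{F_i^-}{1+F_i^-/\lambda}\,\chi^2_{2(i-1)} + \lambda\chi^2_{1,m-i} < F_m^+\left[\frac{\chi^2_{2(i-1)}}{1+F_i^-/\lambda} + \chi^2_{2,m-i}\right]\right\}, \tag{2.1.2}$$

and so, if λ = 1,

$$\Pr\{F_m < F_m^+ \mid F_i = F_i^-\} = \Pr\left[\frac{\chi^2_{1,m-i}}{\dfrac{F_m^+ - F_i^-}{1+F_i^-}\,\chi^2_{2(i-1)} + F_m^+\,\chi^2_{2,m-i}} < 1\right]. \tag{2.1.3}$$

Using the approximation

$$a_1\chi^2_{\nu_1} + a_2\chi^2_{\nu_2} \approx g\chi^2_{\nu}$$

with $g = (a_1^2\nu_1 + a_2^2\nu_2)/(a_1\nu_1 + a_2\nu_2)$, $\nu = (a_1\nu_1 + a_2\nu_2)^2/(a_1^2\nu_1 + a_2^2\nu_2)$
(see, for example, Welch, 1938; Box, 1954), we have

$$\Pr\{F_m < F_m^+ \mid F_i = F_i^-\} \approx \Pr\left[\frac{\chi^2_{m-i}}{g\chi^2_{\nu}} < 1\right], \tag{2.1.4}$$

where

$$g = \frac{\left(\dfrac{F_m^+ - F_i^-}{1+F_i^-}\right)^2 2(i-1) + (F_m^+)^2(m-i)}{\left(\dfrac{F_m^+ - F_i^-}{1+F_i^-}\right) 2(i-1) + F_m^+(m-i)}, \qquad \nu = \frac{\left[\left(\dfrac{F_m^+ - F_i^-}{1+F_i^-}\right) 2(i-1) + F_m^+(m-i)\right]^2}{\left(\dfrac{F_m^+ - F_i^-}{1+F_i^-}\right)^2 2(i-1) + (F_m^+)^2(m-i)}.$$

The right-hand side of (2.1.4) can be evaluated as the incomplete beta-function $I_{g/(1+g)}(\tfrac12(m-i), \tfrac12\nu)$, and so an approximate value for $\eta_{im}$ can be obtained.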
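The two-moment matching behind (2.1.4) is easy to mechanise. The following is a minimal sketch, not the author's own computation: the function names `welch_match` and `beta_argument` are hypothetical, λ = 1 is assumed, and the boundary values in the usage line are simply read from Table 1 for illustration.

```python
def welch_match(a1, nu1, a2, nu2):
    """Return (g, nu) such that g*chi2(nu) has the same mean and variance
    as a1*chi2(nu1) + a2*chi2(nu2), the two chi-squared variables being
    independent."""
    first = a1 * nu1 + a2 * nu2              # mean of the weighted sum
    second = a1 ** 2 * nu1 + a2 ** 2 * nu2   # half the variance of the sum
    return second / first, first ** 2 / second


def beta_argument(f_upper_m, f_lower_i, i, m):
    """Hypothetical helper for the case lambda = 1: returns g, nu and the
    argument g/(1+g) of the incomplete beta-function in (2.1.4)."""
    a1 = (f_upper_m - f_lower_i) / (1.0 + f_lower_i)
    g, nu = welch_match(a1, 2 * (i - 1), f_upper_m, m - i)
    return g, nu, g / (1.0 + g)


# Illustration only: lower boundary at stage 4, upper boundary at stage 5.
g, nu, x = beta_argument(15.393, 0.550, i=4, m=5)
```

The incomplete beta-function itself is not in the Python standard library, so the sketch stops at its argument; any numerical library providing a regularised incomplete beta could be applied to `x` with parameters ½(m-i) and ½ν.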





2.2. We will now consider two illustrative examples.


Example 1
Let $H_0$ specify $\lambda = \lambda_0 = 1$, $H_1$ specify $\lambda = \lambda_1 = 16$, with α = β = 0.05.
The continuation region after the mth stage is then

$$\frac{1}{19} < \left(\frac{1}{4}\right)^{m-1}\left(\frac{1+F_m}{1+\tfrac{1}{16}F_m}\right)^{m-1} < 19.$$

The boundary values for $F_m$ are shown in Table 1.
We will consider the distribution of the d.s.n. when $H_0$ is true, i.e. when λ = 1.

In the present case it is possible to compute an approximate value for $\eta_{im}$ (and similarly for $\theta_{im}$). Hence upper and lower bounds for the cumulative distribution function of the d.s.n. can be computed. We will take the mean of these bounds to get a single estimate of the cumulative distribution function of the d.s.n. As a check on these estimates, the distribution was also computed by quadrature, forming the conditional distributions of $F_m$, stage by stage.

The results of these calculations are shown in Table 2, where the figures in the fourth column are the differences of $\tfrac12(L_m + U_m)$.


Table 1

   m     F_m^-     F_m^+
   4     0.550    29.303
   5     1.041    15.393
   6     1.415    11.346
   7     1.710     9.309
   8     1.941     8.174
   9     2.135     7.475
  10     2.300     6.937

Table 2

         Cumulative Σ p_m
   m   Lower bound   Upper bound   Approximate   p_m by
          L_m           U_m           p_m        quadrature
   4    0.32066       0.32066       0.32066       0.32066
   5    0.52534       0.58840       0.23621       0.23793
   6    0.65933       0.77988       0.16274       0.14465
   7    0.74155       0.89241       0.09738       0.09275
   8    0.80598       0.95524       0.06365       0.06835
   9    0.85193       0.98126       0.03598       0.04297
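The "approximate $p_m$" column can be reproduced from the two bounds. The short sketch below, in which the function name `approximate_pm` is ours, takes the mean of the bounds at each stage and differences successive means, using the figures of Table 2.

```python
lower = [0.32066, 0.52534, 0.65933, 0.74155, 0.80598, 0.85193]  # L_m, m = 4..9
upper = [0.32066, 0.58840, 0.77988, 0.89241, 0.95524, 0.98126]  # U_m, m = 4..9


def approximate_pm(lower, upper):
    """Differences of the mid-points (L_m + U_m)/2 of the two bounds."""
    mid = [(lo + up) / 2.0 for lo, up in zip(lower, upper)]
    return [mid[0]] + [b - a for a, b in zip(mid, mid[1:])]


p_m = approximate_pm(lower, upper)
```

By construction the entries of `p_m` sum to the last mid-point, so the column is an estimate of the probability function whose cumulative sums lie between the bounds.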
Example 2
Let $H_0$ specify $\lambda = \lambda_0 = 1$, $H_1$ specify $\lambda = \lambda_1 = 17$, with α = β = 0.01. The continuation region is then

$$\frac{1}{99} < \lambda_1^{-\frac12(m-1)}\left(\frac{1+F_m}{1+F_m/\lambda_1}\right)^{m-1} < 99.$$

The approximate distribution of the d.s.n. and its upper and lower bounds, calculated by the methods used in Example 1, are given in Table 3. The figures in the fourth column have been calculated as for Table 2.


Table 3

         Cumulative distribution
   m   Lower bound   Upper bound   Approximate
          L_m           U_m           p_m
   5    0.15743       0.15743       0.15743
   6    0.36758       0.38821       0.22046
   7    0.52071       0.56987       0.16740
   8    0.62224       0.70606       0.11886
   9    0.71138       0.79579       0.08943
  10    0.78089       0.87354       0.07363
  11    0.82917       0.91806       0.04640
  12    0.87299       0.95510       0.04043


Quadrature was not applied in this Example; the accuracy of the approximations should be similar to that obtained in Example 1.


2.3. In many cases it is not possible to estimate $\eta_{im}$, even approximately, by the methods outlined in this paper. For example, this cannot be done for the sequential procedure in a one-way classification (by groups), with a fixed number of groups, described by Johnson (1954). (The calculation is possible, on the other hand, for a sequential procedure in which the number of groups is increased in the course of sampling.) However, even when the methods used in Examples 1 and 2 are not applicable, it is always possible to evaluate the lower bound, $P_m$, and upper bounds can often be found by ad hoc methods (as in Example 3 below).

2.4. Sequential t-tests. For the sequential t-test, discriminating between two values of the mean of a Normal population with unknown variance, described by Rushton (1952), it is not possible to calculate $\eta_{im}$. However, an upper bound is provided by the cumulative distribution function of the d.s.n. for the standard sequential procedure discriminating between the same two values of the mean when the population variance is known.

Example 3

Let us suppose that x is a Normal variable with mean θ and standard deviation σ; write $\delta = (\theta - \theta_0)/\sigma$.

We want to discriminate between

$H_0$ specifying $\delta = \delta_0 = 0$ and $H_1$ specifying $\delta = \delta_1 = 1$, with α = β = 0.05.

This can be done by a one-sided sequential t-test with boundary values given by Rushton (1952). Table 4 gives lower and upper bounds for the distribution of the d.s.n. calculated as described above.


Table 4

         Cumulative distribution
   m   Lower bound   Upper bound   Approximate
                                      p_m
   4    0.33217       0.36117       0.34667
   6    0.52210       0.57979       0.20432
   8    0.64220       0.72315       0.13172
  10    0.73179       0.82163       0.09400
  15    0.86414       0.94316       0.12694
  20    0.93066       0.99008       0.05672
  25    0.96229       1.00000       0.02077


The approximate values of $p_m$ in the last column have been calculated in the same way as in the preceding examples.


The author would like to thank Dr N. L. Johnson for his helpful suggestions during the investigation of this problem.

On the covariance determinants of moving-average and autoregressive models


By P. D. FINCH
Research Techniques Division, London School of Economics


1. INTRODUCTION


It has been noted (Durbin, 1959) that the limiting covariance determinant of n successive values of the second-order moving-average series

$$x_t = \epsilon_t + \beta_1\epsilon_{t-1} + \beta_2\epsilon_{t-2}, \tag{1}$$

where the $\epsilon_t$ have zero mean, unit variance and are uncorrelated, is the same as the limiting covariance determinant of n successive values of the autoregressive series

$$x_t + \beta_1 x_{t-1} + \beta_2 x_{t-2} = \epsilon_t \tag{2}$$

with the same coefficients and the same residual variance.

In this note we show that this result is generally true. We formulate the generalization for discrete weakly stationary processes with rational spectral density functions, although it will be seen that the proof will apply to processes with absolutely continuous spectral distribution functions whose spectral densities are reciprocals of each other and satisfy a Lipschitz condition (see formulation of Theorem GS below). Explicitly we prove the following theorem.


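THEOREM 1. Let

$$A(z) = a_0 + a_1 z + a_2 z^2 + \dots + a_p z^p \quad (a_0 = 1),$$
$$B(z) = b_0 + b_1 z + b_2 z^2 + \dots + b_q z^q \quad (b_0 = 1),$$

where p, q are integral, the coefficients $a_j$, $b_j$ are real and all the roots of A(z) and all those of B(z) lie without the unit circle |z| = 1. Let $\{\epsilon_t\}$, $\{\eta_t\}$ be two independent (real) processes with

$$E\{\epsilon_t\} = E\{\eta_t\} = 0, \qquad E\{\epsilon_s\epsilon_t\} = E\{\eta_s\eta_t\} = \delta_{st}.$$

Consider the two weakly stationary (real) processes $\{x_t\}$, $\{y_t\}$ satisfying respectively the equations

$$a_0 x_t + a_1 x_{t-1} + \dots + a_p x_{t-p} = b_0\epsilon_t + b_1\epsilon_{t-1} + \dots + b_q\epsilon_{t-q}, \tag{3}$$
$$b_0 y_t + b_1 y_{t-1} + \dots + b_q y_{t-q} = a_0\eta_t + a_1\eta_{t-1} + \dots + a_p\eta_{t-p}. \tag{4}$$

Let $X_n$, $Y_n$ be the covariance determinants of order (n + 1) of the processes $\{x_t\}$, $\{y_t\}$. That is

$$X_n = \det\|E\{x_s x_t\}\| \quad (s, t = 0, 1, \dots, n),$$
$$Y_n = \det\|E\{y_s y_t\}\| \quad (s, t = 0, 1, \dots, n).$$

Then

$$\lim_{n\to\infty} X_n = \lim_{n\to\infty} Y_n. \tag{5}$$

Further the common limiting value of these determinants is given explicitly by the following expression

$$\left|\frac{\beta_1\beta_2\cdots\beta_q}{\alpha_1\alpha_2\cdots\alpha_p}\right|^{2(q-p)} \frac{\prod|\beta_\mu\alpha_\nu - 1|^2}{\prod|\beta_\mu\beta_{\mu'} - 1|\,\prod|\alpha_\nu\alpha_{\nu'} - 1|} \quad (\mu, \mu' = 1, 2, \dots, q;\ \nu, \nu' = 1, 2, \dots, p), \tag{6}$$

where $\alpha_1, \alpha_2, \dots, \alpha_p$ are the distinct roots of A(z) and $\beta_1, \beta_2, \dots, \beta_q$ are the distinct roots of B(z).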





2. PROOF OF THE THEOREM


The theorem may be deduced very simply from a general theorem on the determinants of Toeplitz forms. The general theorem is proved in Grenander & Szegő (1958, §5.5) and for our purpose may be formulated as follows.


THEOREM GS. Let f(λ) be positive and continuous in [-π, π] with f(-π) = f(π). Let f′(λ) exist and satisfy the Lipschitz condition

$$|f'(\lambda_1) - f'(\lambda_2)| < K|\lambda_1 - \lambda_2|^{\alpha} \quad (K > 0,\ 0 < \alpha < 1).$$

Let

$$T_n(f) = \frac{1}{2\pi}\int_{-\pi}^{\pi} |u_0 + u_1 e^{i\lambda} + \dots + u_n e^{in\lambda}|^2 f(\lambda)\,d\lambda$$

be the nth Toeplitz form associated with f(λ) and let $D_n(f)$ be the determinant of this form. Then

$$\lim_{n\to\infty} \frac{D_n(f)}{[G(f)]^{n+1}} = \exp\left\{\frac{1}{\pi}\iint \left|\frac{g'(z)}{g(z)}\right|^2 d\sigma\right\}, \tag{7}$$

where G(f) = g(0) and g(z) is a function uniquely determined by f(λ) satisfying

$$[g(0)]^{-2} = \exp\left\{\frac{1}{2\pi}\int_{-\pi}^{\pi} \log f(\lambda)\,d\lambda\right\}, \qquad |g(e^{i\lambda})|^{-2} = f(\lambda),$$

and the integration on the right of (7) extends over the unit circle |z| < 1.







































































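It is well known that the processes $\{x_t\}$, $\{y_t\}$ defined by (3) and (4) have absolutely continuous spectral distribution functions with spectral densities $(2\pi)^{-1}f_x(\lambda)$, $(2\pi)^{-1}f_y(\lambda)$ given by

$$f_x(\lambda) = \left|\frac{B(e^{i\lambda})}{A(e^{i\lambda})}\right|^2, \qquad f_y(\lambda) = \left|\frac{A(e^{i\lambda})}{B(e^{i\lambda})}\right|^2.$$

The covariances of the $\{x_t\}$, $\{y_t\}$ processes are given by

$$E\{x_s x_t\} = \frac{1}{2\pi}\int_{-\pi}^{\pi} e^{i(s-t)\lambda} f_x(\lambda)\,d\lambda, \qquad E\{y_s y_t\} = \frac{1}{2\pi}\int_{-\pi}^{\pi} e^{i(s-t)\lambda} f_y(\lambda)\,d\lambda,$$

so the covariance determinants $X_n$, $Y_n$ are just the determinants of the Toeplitz forms $T_n(f_x)$, $T_n(f_y)$, respectively.
To apply Theorem GS we remark that the functions $g_x(z)$, $g_y(z)$ are given by

$$g_x(z) = \frac{A(z)}{B(z)}, \qquad g_y(z) = \frac{B(z)}{A(z)},$$

and that $G(f_x) = g_x(0) = G(f_y) = g_y(0) = 1$. Thus we have

$$\lim_{n\to\infty} X_n = \exp\left\{\frac{1}{\pi}\iint \left|\frac{g_x'(z)}{g_x(z)}\right|^2 d\sigma\right\}, \qquad \lim_{n\to\infty} Y_n = \exp\left\{\frac{1}{\pi}\iint \left|\frac{g_y'(z)}{g_y(z)}\right|^2 d\sigma\right\}.$$

But $g_x(z) = [g_y(z)]^{-1}$, hence

$$\frac{g_x'(z)}{g_x(z)} = -\frac{g_y'(z)}{g_y(z)},$$

and consequently $\lim X_n = \lim Y_n$, which proves (5).
In order to obtain (6) we evaluate the integral on the right of (7). Note that

$$\frac{g_x'(z)}{g_x(z)} = \frac{A'(z)}{A(z)} - \frac{B'(z)}{B(z)}.$$

Then if $A(z) = A(\alpha_1 - z)(\alpha_2 - z)\cdots(\alpha_p - z)$ and $B(z) = B(\beta_1 - z)(\beta_2 - z)\cdots(\beta_q - z)$ we obtain

$$\frac{g_x'(z)}{g_x(z)} = \sum_{\mu=1}^{q}\frac{1}{\beta_\mu - z} - \sum_{\nu=1}^{p}\frac{1}{\alpha_\nu - z},$$

and, the roots of each polynomial occurring in conjugate pairs,

$$\left|\frac{g_x'(z)}{g_x(z)}\right|^2 = \sum_{\mu,\mu'}\frac{1}{(\beta_\mu - z)(\beta_{\mu'} - \bar z)} + \sum_{\nu,\nu'}\frac{1}{(\alpha_\nu - z)(\alpha_{\nu'} - \bar z)} - \sum_{\mu,\nu}\frac{1}{(\beta_\mu - z)(\alpha_\nu - \bar z)} - \sum_{\mu,\nu}\frac{1}{(\alpha_\nu - z)(\beta_\mu - \bar z)}.$$

Noting that, for |a| > 1 and |b| > 1,

$$\frac{1}{\pi}\iint_{|z|<1}\frac{d\sigma}{(a - z)(b - \bar z)} = \log\frac{ab}{ab - 1},$$

we find

$$\exp\left\{\frac{1}{\pi}\iint\left|\frac{g_x'(z)}{g_x(z)}\right|^2 d\sigma\right\} = \frac{\prod(\beta_\mu\beta_{\mu'})\,\prod(\alpha_\nu\alpha_{\nu'})\,\prod(\beta_\mu\alpha_\nu - 1)^2}{\prod(\beta_\mu\beta_{\mu'} - 1)\,\prod(\alpha_\nu\alpha_{\nu'} - 1)\,\prod(\beta_\mu\alpha_\nu)^2},$$

whence

$$\lim_{n\to\infty} X_n = \left|\frac{\beta_1\beta_2\cdots\beta_q}{\alpha_1\alpha_2\cdots\alpha_p}\right|^{2(q-p)} \frac{\prod|\beta_\mu\alpha_\nu - 1|^2}{\prod|\beta_\mu\beta_{\mu'} - 1|\,\prod|\alpha_\nu\alpha_{\nu'} - 1|} \quad (\mu, \mu' = 1, 2, \dots, q;\ \nu, \nu' = 1, 2, \dots, p).$$

This proves (6).
This result is easily modified if A(z) or B(z) have multiple roots.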
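
The simplest instance of Theorem 1 can be checked numerically. The sketch below is illustrative only: it takes the first-order case A(z) = 1, B(z) = 1 + βz with an arbitrarily chosen β = 0.6, for which (6) reduces to 1/(1 - β²). The helper names `det` and `toeplitz_det` are ours, and the covariance functions are the standard ones for first-order moving-average and autoregressive series with unit residual variance.

```python
def det(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in matrix]
    n = len(a)
    d = 1.0
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(a[r][c]))
        if a[p][c] == 0.0:
            return 0.0
        if p != c:
            a[c], a[p] = a[p], a[c]
            d = -d
        d *= a[c][c]
        for r in range(c + 1, n):
            factor = a[r][c] / a[c][c]
            for k in range(c, n):
                a[r][k] -= factor * a[c][k]
    return d


def toeplitz_det(autocov, order):
    """Determinant of the order x order matrix with entries autocov(|s - t|)."""
    return det([[autocov(abs(s - t)) for t in range(order)]
                for s in range(order)])


beta = 0.6  # assumed value, chosen only for illustration


def ma1(h):
    """Autocovariance of x_t = e_t + beta * e_{t-1}."""
    return 1.0 + beta ** 2 if h == 0 else (beta if h == 1 else 0.0)


def ar1(h):
    """Autocovariance of y_t + beta * y_{t-1} = e_t."""
    return (-beta) ** h / (1.0 - beta ** 2)


limit = 1.0 / (1.0 - beta ** 2)
ma_det = toeplitz_det(ma1, 40)
ar_det = toeplitz_det(ar1, 40)
```

In this special case the autoregressive determinant equals the limit for every order, while the moving-average determinant approaches it geometrically, so the two agree closely at order 40.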


This paper was written under a grant from the Ford Foundation.

A note on some approximations to the variance in discrete-time stochastic models for biological systems


By P. H. LESLIE
Bureau of Animal Population, Department of Zoological Field Studies, Oxford


In the original description of a discrete-time stochastic model for studying the properties of certain biological systems by numerical methods (Leslie, 1958), certain empirical approximations to the variance $\sigma^2(N_{t+1} \mid N_t)$ were suggested in order to save machine time at each step in the calculations. Although these approximations appear to work very well in practice (cf. some of the results given in Bartlett, Gower & Leslie, 1960), their scope is very limited, and it therefore may be a convenience if some more general approximations are available, which should hold over a wider range of possible values for the constant birth-rate (b) and the constant death-rate (d).

Considering only one of the species in a system, and given that this population consists of $N_t$ individuals at time t, then the expected number of individuals at time t + 1 is

$$E\{N_{t+1} \mid N_t\} = \frac{\lambda}{q_t}N_t = \lambda_t N_t,$$

and the variance

$$\sigma^2\{N_{t+1} \mid N_t\} = \phi_t\,\frac{\lambda}{q_t}\,N_t, \tag{1}$$

in which $q_t \ge 1$ is some function of the numbers at time t in each of the interacting populations, and the constant

$$\lambda = e^{b-d} \quad (b > d). \tag{2}$$

When the birth-rate (b) of the particular species is assumed to remain constant (BRC model), we have

$$\log_e \lambda_t = r_t = b - d_t,$$
$$\phi_t = \left(\frac{2b}{r_t} - 1\right)(\lambda_t - 1) \quad (\lambda_t \ne 1), \qquad \phi_t = 2b \quad (\lambda_t = 1), \tag{3}$$

and when the death-rate (d) remains constant (DRC model),

$$\log_e \lambda_t = r_t = b_t - d \quad (\lambda \ge \lambda_t > e^{-d}),$$
$$\phi_t = \left(\frac{2d}{r_t} + 1\right)(\lambda_t - 1) \quad (\lambda_t \ne 1,\ \lambda \ge \lambda_t > e^{-d}), \qquad \phi_t = 2d \quad (\lambda_t = 1). \tag{4}$$


In designing a numerical experiment with these models, some restrictions must be placed on the values of b and λ which may be adopted in (2), because of the difference between discrete-time and continuous-time models in regard to the degree of variability in numbers when a system is fluctuating in the region of the stationary state. The variances and covariances of a discrete-time model are always greater than those of its continuous-time equivalent (a point which was not appreciated in Leslie, 1958), but by reducing the unit of time adopted in the discrete-time model, it is always possible to make the variances of the distributions for the two types of model more in agreement (Bartlett et al. 1960). A convenient set of restrictions in the choice of the parameters in (2) has been found to be

$$\lambda = e^{b-d} < 2.0 \quad (b \le 1.0). \tag{5}$$

If, in the expression for the variance in (1), we regard $\phi_t$ as a function of $\lambda_t$, we have, for example, from (3),

$$\frac{d\phi_t}{d\lambda_t} = \{2b(e^{-r_t} + r_t - 1)/r_t^2\} - 1,$$

i.e.

$$\frac{d\phi_t}{d\lambda_t} = b\left(1 - \frac{r_t}{3} + \frac{r_t^2}{12} - \dots\right) - 1.$$

Neglecting the terms in $r_t$, $r_t^2$, etc., then

$$\frac{d\phi_t}{d\lambda_t} \approx (b - 1), \text{ constant},$$

and hence, to a first approximation, we have for the BRC model,

$$\phi_t \approx (b + 1) + (b - 1)\lambda_t, \tag{6}$$

since from (3) $\phi_t = 2b$ when $\lambda_t = 1$.
Similarly, it may be shown from (4) that as a first approximation for the DRC model,

$$\phi_t \approx (d - 1) + (d + 1)\lambda_t. \tag{7}$$

As an example of these approximations, consider three values of constant b, viz. 1.0, 0.8 and 0.5. The first two of these represent the more extreme values of b which might be adopted in the BRC model, and are compatible with assumed values of λ < 2.0; while in the case of constant b = 0.5, any value of the constant λ must necessarily be less than λ ≈ 1.65. The following are the exact and approximate values of $\phi_t$ over a range of values of $\lambda_t$.








              Exact φ_t                    Approximate φ_t
 λ_t    b=1.0    b=0.8    b=0.5       b=1.0    b=0.8    b=0.5
 1.9    1.904    1.344      -          2.0     1.42      -
 1.7    1.938    1.411      -          2.0     1.46      -
 1.5    1.966    1.473    0.733        2.0     1.50     0.75
 1.3    1.987    1.530    0.843        2.0     1.54     0.85
 1.1    1.998    1.579    0.949        2.0     1.58     0.95
 1.0    2.000    1.600    1.000        2.0     1.60     1.00
 0.9    1.998    1.618    1.049        2.0     1.62     1.05
 0.8    1.993    1.634    1.096        2.0     1.64     1.10
 0.7    1.982    1.646    1.141        2.0     1.66     1.15
 0.6    1.966    1.653    1.183        2.0     1.68     1.20
 0.5    1.943    1.654      -          2.0     1.70      -
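
The comparison in the table can be repeated for other parameter values. The sketch below evaluates the exact expressions (3) and (4) and the linear approximations (6) and (7); the function names are ours, and no claim is made that this is how the original figures were computed.

```python
import math


def phi_brc_exact(b, lam_t):
    """Exact phi_t for the constant birth-rate model, equation (3)."""
    if lam_t == 1.0:
        return 2.0 * b
    r_t = math.log(lam_t)
    return (2.0 * b / r_t - 1.0) * (lam_t - 1.0)


def phi_brc_approx(b, lam_t):
    """First approximation (6): linear in lambda_t, no logarithm needed."""
    return (b + 1.0) + (b - 1.0) * lam_t


def phi_drc_exact(d, lam_t):
    """Exact phi_t for the constant death-rate model, equation (4)."""
    if lam_t == 1.0:
        return 2.0 * d
    r_t = math.log(lam_t)
    return (2.0 * d / r_t + 1.0) * (lam_t - 1.0)


def phi_drc_approx(d, lam_t):
    """First approximation (7)."""
    return (d - 1.0) + (d + 1.0) * lam_t


# One row of the table: lambda_t = 1.9 with b = 0.8.
row = (phi_brc_exact(0.8, 1.9), phi_brc_approx(0.8, 1.9))
```

Both approximations agree with the exact values at $\lambda_t = 1$ by construction, and the linear forms avoid the logarithm that the exact expressions require at every step.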


It will be seen that these approximations are just as good as those originally suggested in Leslie (1958), while for smaller values of the constant birth-rate (b) they become even better. Similarly, for assumed values of the constant death-rate (d) less than d = 0.6, say, it was found that the use of (7) gave a good approximation to the exact values of $\phi_t$ for variable $\lambda_t$, the agreement again becoming better as the values of d became smaller.

These approximations for the BRC and DRC models originally arose in considering the behaviour of $\sigma^2\{N_{t+1} \mid N_t\}$ for a logistic process in the neighbourhood of the stationary state (Bartlett et al. 1960, §§3 and 11); but it appears now that they will also hold over the entire development of any process, provided the restrictions made in (5) in regard to the parameters λ and b are observed. Their use results in a considerable saving of machine time at each step in the calculations for two or more interacting species, since the necessity of including a log subroutine in the programmes is avoided.

On two queues in parallel


By COLERIDGE A. WILKINS
University of New South Wales


In a paper published in this journal, Haight (1958) investigated a two-queue system in which an arrival is assigned to the shorter queue, or if they are of equal length, to a particular one, the 'near' queue.

The purpose of this note is to show that Dr Haight's results can be extended to the more general case where, if X is the length of the near queue, Y that of the other, and W(X, Y) is the probability of an arrival's joining the near one, we have

$$W(x, y) = \begin{cases} 1 & x < y, \\ w(x) & x = y, \\ 0 & x > y. \end{cases}$$

The new assumption does not affect the expressions for A(s, s′), B(s, s′), or C(s, s′), but does change the values of the functions I(x, y) and J(x, y). The modified values of the latter two are tabled below.

 I(x, y)    J(x, y)
    1          0                    x < y - 1
    1       1 - w(x) = χ(x)         x = y - 1
    1          1                    x = y
   w(x)        1                    x = y + 1
    0          1                    x > y + 1

We define $M_k$ by $M_k = \chi(k)\,p_{kk}$.


D(s, 8’) is now given by 


o y «o o fs co 
D(s,8’) =A SY DY Ver, y878'%+ DY (1-M,)a1s89+ YY Payrste”4 EM aere'*h, 
1 0 0 


y=lz= z=ly=1 


We denote this more general form of D by G, and its value when w(x) is constant and equal to one, by H. If we formally subtract, we find

$$G - H = \lambda\sum_{k=0}^{\infty} M_k s^k s'^{\,k}(s' - s) = \lambda F, \text{ say.}$$
The generating function equation for the general case is therefore 
, , 11l—s an 1 oe ' 
(1s) +(s'—8) f= 2 — [dpa N+ —— 1b —P'ale)l+ F. 


The values of F and its first and second derivatives at the points considered by Haight follow (all sums run over k from 0 to ∞).

            s = s′ = 0    s = 1, s′ = 0    s = 0, s′ = 1    s = s′ = 1
 F              0             -M_0              M_0               0
 F_s          -M_0            -M_0           M_1 - M_0         -Σ M_k
 F_s′          M_0          M_0 - M_1           M_0             Σ M_k
 F_ss           0               0           2(M_2 - M_1)      -2 Σ k M_k
 F_s′s′         0          2(M_1 - M_2)          0             2 Σ k M_k
 F_ss′          0             -2M_1             2M_1              0


Haight's equations (16)–(25), (28), (29), (31), (33), (35) and (36) become


—Ap’ + Lp 1+ Apgw(0) = 0, (16) 
—Ap t+ MP1. + Apo X(0) = 0, (17) 
—APo— HP10 + EF’ Pir + MP 29 + APoo (0) = 0, (18) 
—Ap’mg + f'P 1M, — PU + MP + APyow(0) = 0, (19) 
—APy.— Py. + MP2, FAP + AP + A(X(L) Pir — X(O) Poo) = 9, (20) 
eo 
AN = wl —P)+ AZ XE) Paws (21) 
—Apou—H' Put’ Port HP +AX(O) Poo = 9, (22) 
—Ap s—h'D + h'P a+ Ap’ —Aw(0) Poo + Apo + AW(1) Pi, = 0, (23) 
—Apmo — LP + LP + HP1.M4 +AX(O) Poo = 0, (24) 
@ 
AF = p’'(1—p’) ADAH) Pew (25) 
—Ape.—MPe.+ MPs. +APy.—APio + AP29 + AP a1 + A(X (2) Poe — X(1) Par) = 9, (28) 
«o 
D,,(1, 1) = Aw + 2um — 2+ 2up — 2A DY kX k) Pres (29) 
0 
APy2— B' Pig t+ HP 3 +AP 1 — APor — AW(1) Pus + APos + AVi2 + AW(2) Yoo = O, (31) 
D,A1, 1) =~ Aw’ + 2m’p’ — 2p’! + 2h'p’ + 2A D kX(k) Pers (33) 
0 
—(A—p’) p my + f'P .2Me— UP 1+ hPa +A( Por + 2w(1) pi + p’mo) = O, (35) 
— (A=) 1,4 + P23 — WP, + H'Pro + A( Pio + 2X(1) Pus + pm) = O. (36) 


Equations (15), (26), (27), (30), (32) and (34) represent relations that are invariant with respect to w(x). The expression for $D_{ss'}(1, 1)$ is also independent of w(x).

From the form of the general generating function equation, it is obvious that any relation between λ, μ, ρ′, and $p_{xy}$ found by differentiating twice or more with respect to one of the pair s, s′ and setting the other equal to zero is unchanged by altering w(x).
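
Both invariance remarks can be traced to the form of F. The following worked sketch assumes only the series $F = \sum_k M_k s^k s'^k (s' - s)$ defined earlier in this note and that term-by-term differentiation is permissible at the points concerned.

```latex
% F written as a difference of two monomial series
F(s,s') = \sum_{k=0}^{\infty} M_k \left( s^{k} s'^{\,k+1} - s^{k+1} s'^{\,k} \right),
\qquad F(s,0) = -M_0\, s, \qquad F(0,s') = M_0\, s'.

% F is linear in s when s' = 0, and linear in s' when s = 0, so
\left.\frac{\partial^{n} F}{\partial s^{n}}\right|_{s'=0} = 0,
\qquad
\left.\frac{\partial^{n} F}{\partial s'^{\,n}}\right|_{s=0} = 0
\qquad (n \ge 2).

% The mixed second derivative vanishes at (1,1)
F_{ss'}(s,s') = \sum_{k=0}^{\infty} k(k+1)\, M_k \left( s^{k-1} s'^{\,k} - s^{k} s'^{\,k-1} \right),
\qquad F_{ss'}(1,1) = 0 .
```

The first pair of results shows why relations obtained by differentiating twice or more in one variable, with the other set to zero, do not involve w(x); the last shows why the mixed derivative at (1, 1) is unaffected.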

The further modifications that must be made are left to the reader.


The author wishes to thank Dr Haight for help received first as his student and later as his colleague.

The characteristic function of Hermitian quadratic forms in complex normal variables


By G. L. TURIN 
Hughes Research Laboratories, Culver City, California 


INTRODUCTION 


Wooding (1956) has given an expression for the multivariate distribution of several complex normal 
variables of a special type often encountered in the analysis of random noise. In this same connexion, 
a problem of great interest is the determination of the distribution of real-valued (hence, Hermitian) 
quadratic forms in these variables. Such a distribution arises, for example, in the analysis of the per- 
formance of a radar receiver with a square-law detector and a post-detection filter, for the output of 
such a receiver is representable as a weighted sum of the squared moduli of complex normal variables 
of this special type (cf. Kac & Siegert, 1947). Again, the distribution arises in the computation of the 
probability that an ideal binary hypothesis-testing communications receiver will err in its decision 
concerning which of two signals was transmitted through a channel comprising both a random trans- 
mission medium, such as the ionosphere, and additive thermal noise (Turin, 1958, 1959). 










A first step in the determination of the distribution of the quadratic form, that of obtaining its 
characteristic function, is easily accomplished through the use of Wooding’s result. The distribution itself 
may then be obtained by Fourier transformation of the characteristic function. 

Although some of the results presented here have been derived elsewhere by use of a real-variable 
approach, it is felt that the complex-variable formulation, which arises so naturally in noise analysis, 
will be of interest in its own right, as well as in providing a simplified derivation of the characteristic 
function. 


DERIVATION OF THE CHARACTERISTIC FUNCTION 


Let $v_n$ (n = 1, ..., N) be a set of complex random variables, the real and imaginary parts of which, $x_n$, $y_n$, are normally distributed and satisfy the relationships

$$E[(x_m - \bar x_m)(x_n - \bar x_n)] = E[(y_m - \bar y_m)(y_n - \bar y_n)] \tag{1}$$

and

$$E[(x_m - \bar x_m)(y_n - \bar y_n)] = -E[(x_n - \bar x_n)(y_m - \bar y_m)], \tag{2}$$

where $\bar x_n = E(x_n)$ and $\bar y_n = E(y_n)$. Denote by V the column matrix formed from the $v_n$, and let $\bar V = E(V)$ be the matrix of the (complex) means and $L = E[(V - \bar V)(V - \bar V)'^*]$ the (complex) covariance matrix, assumed non-singular. Further, let Q be any Hermitian matrix. Then it is found that the characteristic function of the (real) quadratic form

$$f = V'^* Q V \tag{3}$$

is given by

$$\phi(t) = |I - itLQ|^{-1}\exp\{-\bar V'^* L^{-1}[I - (I - itLQ)^{-1}]\bar V\}, \tag{4a}$$

where I is the unit matrix.

The quadratic form in the exponent of (4a) may be diagonalized by use of the transformation $\bar V = U_1 M^{-1} U_2 D$, where $U_1$ is the normalized modal matrix of $L^{-1}$, $M^2 = U_1'^* L^{-1} U_1$, and $U_2$ is the normalized modal matrix of $M^{-1} U_1'^* Q U_1 M^{-1}$. Then, alternatively,

$$\phi(t) = \prod_{n=1}^{N} \frac{\exp\left\{\dfrac{it\lambda_n |d_n|^2}{1 - it\lambda_n}\right\}}{1 - it\lambda_n}, \tag{4b}$$

where the $d_n$ are the elements of D, and the $\lambda_n$ are the eigenvalues of LQ (cf. Kac & Siegert, 1947).
To prove (4a), one first notes that the characteristic function of the random variable f is defined as

$$\phi(t) = E(e^{itf}) = \int_V p(V)\exp(itV'^* Q V)\,dV, \tag{5}$$

where p(V) is the multivariate density distribution of the N complex normal variables $v_n$ and

$$dV = \prod_{n=1}^{N} dx_n\,dy_n.$$


Wooding's expression for p(V), modified to include explicitly the case for which $E(v_n) \ne 0$ (and thus including, in noise analysis, the case of a non-random or 'signal' component), is

$$p(V) = \pi^{-N}|L|^{-1}\exp[-(V - \bar V)'^* L^{-1}(V - \bar V)], \tag{6}$$

the exponent of which may be written

$$-V'^* L^{-1} V - \bar V'^* L^{-1}\bar V + \bar V'^* L^{-1} V + V'^* L^{-1}\bar V. \tag{7}$$

Since the covariance matrix L is clearly Hermitian, $L^{-1}$ is Hermitian, and $(L^{-1})' = (L^{-1})^*$. Hence it follows that

$$\bar V'^* L^{-1} V = V'(L^{-1})'\bar V^* = V'(L^{-1})^*\bar V^* = [V'^* L^{-1}\bar V]^*, \tag{8}$$

and (7) becomes

$$-V'^* L^{-1} V - \bar V'^* L^{-1}\bar V + 2\,\mathrm{Re}(V'^* L^{-1}\bar V). \tag{9}$$

Using (6) and (9), (5) becomes

$$\phi(t) = \pi^{-N}|L|^{-1}\exp(-\bar V'^* L^{-1}\bar V)\int_V \exp[2\,\mathrm{Re}(V'^* L^{-1}\bar V) - V'^*(L^{-1} - itQ)V]\,dV. \tag{10}$$

Now, it is easy to show by a generalization of a proof by Cramér (1951, pp. 118-20) that if A and V are any (N-element) column matrices and B and C are (N × N) Hermitian matrices,

$$\int_V \exp[\mathrm{Re}(A'^* V) - V'^*(B + iC)V]\,dV = \pi^N|B + iC|^{-1}\exp[\tfrac14 A'^*(B + iC)^{-1}A], \tag{11}$$

provided that B is positive definite. Since L is a non-singular covariance matrix, it is positive definite (Cramér, 1951, p. 295); hence, $L^{-1}$ is positive definite, and (11) may be applied to (10). The integral in (10) then becomes

$$\pi^N|L^{-1} - itQ|^{-1}\exp[\bar V'^* L^{-1}(L^{-1} - itQ)^{-1}L^{-1}\bar V], \tag{12}$$

which, inserted into (10), easily yields (4a).

Equation (4b) may alternatively be derived from an intermediate step in the proof of (11), which involves the simultaneous diagonalization of B and C (i.e. $L^{-1}$ and Q). Equivalently, as a referee has noted, this diagonalization is to be regarded as reducing the form f of (3) to a linear combination of independent, non-central χ² variables of two degrees of freedom.


EVALUATION OF THE DENSITY DISTRIBUTION

The density distribution of f,

$$p(f) = \frac{1}{2\pi}\int_{-\infty}^{\infty}\phi(t)e^{-itf}\,dt, \tag{13}$$

is generally difficult to obtain precisely, because of the complicated nature of the exponential factor in φ(t). However, in the special case in which the variables $v_n$ have zero means (corresponding, in noise problems, to the absence of a signal component), $\bar V = 0$, and (4a) and (4b) reduce to

$$\phi(t) = |I - itLQ|^{-1} = \prod_{n=1}^{N}(1 - it\lambda_n)^{-1}. \tag{14}$$

This is the reciprocal of an Nth degree polynomial in (it), the zeros of which lie at the reciprocals of the eigenvalues of the matrix LQ. The singularities of φ(t) in this case hence consist solely in a finite number of finite-order poles; once the eigenvalues of LQ are found, p(f) is therefore readily determined through the use of the residue theorem.
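
For the simplest configuration of poles the residue calculation can be written out explicitly. The sketch below assumes, beyond what is stated in the text, that the eigenvalues of LQ are all positive and distinct, so that every pole is simple and the density vanishes for negative f; each factor $(1 - it\lambda_n)^{-1}$ is then the characteristic function of an exponential variable with mean $\lambda_n$, and a partial-fraction expansion gives a weighted sum of exponentials. The function name `density_zero_mean` is ours.

```python
import math


def density_zero_mean(eigenvalues, f):
    """p(f) when phi(t) = prod (1 - i t lam_n)^(-1), assuming the lam_n
    are positive and pairwise distinct (simple poles only)."""
    if f < 0.0:
        return 0.0
    total = 0.0
    for n, lam_n in enumerate(eigenvalues):
        weight = 1.0
        for m, lam_m in enumerate(eigenvalues):
            if m != n:
                weight *= lam_n / (lam_n - lam_m)  # partial-fraction coefficient
        total += weight * math.exp(-f / lam_n) / lam_n
    return total


# Illustrative eigenvalues only; trapezoidal check that the density integrates to 1.
lams = [1.0, 0.5, 0.25]
step = 0.001
grid = [k * step for k in range(40001)]
values = [density_zero_mean(lams, x) for x in grid]
area = step * (sum(values) - 0.5 * (values[0] + values[-1]))
```

Repeated eigenvalues would give higher-order poles, and negative eigenvalues would contribute terms for f < 0; neither case is covered by this sketch.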

It is interesting to note that this appearance of poles in φ(t) when $\bar V = 0$ occurs because of the special constraints placed on the 2N real random variables $x_n$, $y_n$ by (1) and (2). It has been shown by Turin (1956) that if one considers more generally a quadratic form of unconstrained real variables Z with covariance matrix M, one obtains the characteristic function

$$\phi(t) = |I - 2itMR|^{-\frac12}\exp\{-\tfrac12\bar Z' M^{-1}[I - (I - 2itMR)^{-1}]\bar Z\}, \tag{15}$$

where R is the (real) matrix of the quadratic form; this expression may of course be put into a diagonal form similar to that of (4b). Clearly, in the special case when the real variables Z are constrained in the fashion prescribed by (1) and (2), the eigenvalues of MR must be of even multiplicity so that the necessary equality of (15) and (4) will obtain. In general, however, such a pairing of eigenvalues does not occur. Hence, when $\bar Z = 0$ in (15) (cf. Whittle, 1951), φ(t) typically displays branch points among its singularities, rather than poles only, and the evaluation of p(f) from φ(t) becomes correspondingly more difficult (see Grenander, Pollak & Slepian, 1958).

Some numerical results for waiting times in the queue $E_k$/M/1


By C. BURROWS
University College London


Kendall (1953) gave several equilibrium results for the queue represented by GI/M/s (i.e. general independent input, exponential service-time distribution, s (≥ 1) servers) with the simple 'first come, first served' queue discipline. It is found that the equilibrium distribution of queue length is geometric except for modifications to the first (s - 1) terms, the common ratio for the geometric part of the series being λ which satisfies the equation

$$\lambda = \int_0^{\infty} e^{-(1-\lambda)u/\rho}\,dA(u) \quad (0 < \lambda < 1). \tag{1}$$

In this expression A(u) is the cumulative distribution of the inter-arrival times adjusted so as to have a mean of unity, and ρ is the traffic density, i.e. the ratio of the mean service time divided by s, to the mean inter-arrival time.

In this note some numerical results are tabulated for the situation where the inter-arrival distribution is of the χ² form with ν degrees of freedom, so that

$$dA(u) = \frac{1}{\Gamma(\tfrac12\nu)}\,(\tfrac12\nu)^{\frac12\nu}\,u^{\frac12\nu - 1}\,e^{-\frac12\nu u}\,du.$$

If we take ν = 2k then, when k = 1, dA(u) is an exponential distribution and as k tends to ∞, dA(u) approaches a deterministic form. Thus in Kendall's notation the system is of the form $E_k$/M/s. Equation (1) gives

$$\lambda = \int_0^{\infty} e^{-(1-\lambda)u/\rho}\,\frac{k^k}{\Gamma(k)}\,u^{k-1}e^{-uk}\,du = \left(\frac{k\rho}{k\rho + 1 - \lambda}\right)^{k}.$$

This equation may be solved by iteration and values of λ are given in Table 1 for values of k and ρ.
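
The iteration is simple to carry out. The sketch below uses plain fixed-point iteration on the closed form $\lambda = \{k\rho/(k\rho + 1 - \lambda)\}^k$, starting from zero; the function name `queue_parameter`, the tolerance and the iteration cap are our own choices, and other iterative schemes may well have been used for the published table.

```python
def queue_parameter(k, rho, tol=1e-12, max_iter=100000):
    """Smallest root in (0, 1) of lam = (k*rho / (k*rho + 1 - lam))**k,
    found by fixed-point iteration from lam = 0 (requires 0 < rho < 1)."""
    lam = 0.0
    for _ in range(max_iter):
        new = (k * rho / (k * rho + 1.0 - lam)) ** k
        if abs(new - lam) < tol:
            return new
        lam = new
    raise RuntimeError("iteration did not converge")


# Example: k = 2 (four degrees of freedom) at traffic density 0.5.
lam_example = queue_parameter(2, 0.5)
```

Starting from zero the iterates increase monotonically, so the scheme settles on the root below unity rather than on the trivial root λ = 1. For k = 1 the root is λ = ρ, the familiar result for exponential inter-arrival times.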


Table 1. Values of λ, the queue parameter
(An entry such as 0.0²839 denotes 0.00839.)

                                              ρ
 k-1     0.05      0.1      0.2      0.3      0.4      0.5      0.6      0.7      0.8      0.9
 0      0.0500   0.1000   0.2000   0.3000   0.4000   0.5000   0.6000   0.7000   0.8000   0.9000
 0.5    0.0122   0.0504   0.1298   0.2218   0.3215   0.4267   0.5359   0.6484   0.7635   0.8808
 1      0.0²839  0.0292   0.0938   0.1789   0.2753   0.3820   0.4958   0.6155   0.7399   0.8682
 2      0.0²223  0.0127   0.0590   0.1319   0.2239   0.3305   0.4485   0.5759   0.7109   0.8526
 3      0.0³774  0.0²680  0.0430   0.1083   0.1963   0.3019   0.4216   0.5529   0.6939   0.8433
 4      0.0³321  0.0²417  0.0341   0.0942   0.1792   0.2838   0.4043   0.5379   0.6828   0.8372
 5      0.0³151  0.0²281  0.0285   0.0849   0.1675   0.2712   0.3922   0.5274   0.6749   0.8328
 10     0.0⁴112  0.0³818  0.0172   0.0642   0.1405   0.2414   0.3629   0.5017   0.6553   0.8229
 15     0.0⁵232  0.0³424  0.0136   0.0567   0.1302   0.2292   0.3512   0.4913   0.6474   0.8175
 20     0.0⁶791  0.0³281  0.0118   0.0528   0.1247   0.2235   0.3450   0.4857   0.6432   0.8151
 ∞      0.0⁸206  0.0⁴454  0.0²697  0.0409   0.1074   0.2032   0.3242   0.4670   0.6286   0.8069


The values in the row k - 1 = ∞ correspond to Table 2 in Kendall (1953, p. 351). The table shows that as k increases and the inter-arrival time distribution departs more and more from the exponential, λ decreases so that the queue length distribution has decreased probabilities for its tail values.

The distribution of waiting time (w) (i.e. the time spent in the queue up to the start of service) is exponential except for a probability concentration at w = 0, and for the case where we have only one server (s = 1) Kendall (1953) gives for the expected waiting time

$$R = \frac{E(w)}{E(v)} = \frac{\lambda}{1 - \lambda}.$$

Here R is the ratio of expected time spent waiting for service to the expectation of the time spent in service (v). This may be regarded as a figure of demerit for the system. When the inter-arrival time distribution is of the exponential form we have

$$R_M = \frac{\rho}{1 - \rho},$$

and in Table 2 values of $R/R_M$ are tabulated.


Table 2. Values of $R/R_M$
(An entry such as 0.0²609 denotes 0.00609.)

                                              ρ
 k-1     0.05      0.1      0.2      0.3      0.4      0.5      0.6      0.7      0.8      0.9     R′/R_M
 0.5    0.2347   0.4775   0.5968   0.6650   0.7107   0.7443   0.7698   0.7903   0.8071   0.8210   0.8333
 1      0.1608   0.2705   0.4140   0.5052   0.5699   0.6181   0.6555   0.6861   0.7112   0.7319   0.7500
 2      0.0425   0.1154   0.2509   0.3544   0.4327   0.4937   0.5421   0.5820   0.6148   0.6427   0.6667
 3      0.0147   0.0615   0.1796   0.2835   0.3663   0.4325   0.4859   0.5300   0.5667   0.5978   0.6250
 4      0.0²609  0.0377   0.1410   0.2427   0.3275   0.3963   0.4525   0.4989   0.5382   0.5714   0.6000
 5      0.0²287  0.0254   0.1172   0.2165   0.3018   0.3721   0.4302   0.4783   0.5190   0.5534   0.5833
 10     0.0³214  0.0²737  0.0700   0.1600   0.2452   0.3182   0.3797   0.4315   0.4753   0.5163   0.5455
 15     0.0⁴440  0.0²382  0.0552   0.1401   0.2246   0.2974   0.3609   0.4139   0.4590   0.4977   0.5313
 20     0.0⁴150  0.0²253  0.0479   0.1300   0.2137   0.2878   0.3511   0.4047   0.4507   0.4898   0.5238
 ∞      0.0⁷392  0.0³409  0.0281   0.0995   0.1805   0.2550   0.3198   0.3755   0.4231   0.4643   0.5000
 R_M    0.0526   0.1111   0.2500   0.4286   0.6667   1.0000   1.5000   2.3333   4.0000   9.0000


We may compare these results with those obtained when the inter-arrival time distribution is exponential, while the service-time distribution is not necessarily so. From Kendall (1951) we have

$$R' = \frac{\rho}{2(1 - \rho)}\left[1 + \frac{\operatorname{var}(v)}{b^2}\right],$$

where var (v) is the variance of the service-time distribution which has mean b. R′ is a minimum when var (v) is zero, i.e. all service times are of equal length, and then R′ is half of the value $R_M$. The ratio $R'/R_M$ is independent of ρ and in the extreme right-hand column of Table 2 values of the ratio are given for the case where the service-time distribution is of the χ² form with 2k degrees of freedom.
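
The two ratios compared in Table 2 can be computed directly. The sketch below takes R = λ/(1 - λ) and $R_M$ = ρ/(1 - ρ), which are the forms the tabulated values satisfy, and uses the fact that a χ² variable with 2k degrees of freedom has squared coefficient of variation 1/k, so that $R'/R_M = \tfrac12(1 + 1/k)$. The function names are ours.

```python
def ratio_arrivals(lam, rho):
    """R / R_M for a given queue parameter lam and traffic density rho."""
    return (lam / (1.0 - lam)) / (rho / (1.0 - rho))


def ratio_service(k):
    """R' / R_M when service times are chi-squared with 2k degrees of
    freedom: var(v) / b**2 = 1 / k."""
    return 0.5 * (1.0 + 1.0 / k)


# Row k - 1 = 1 (k = 2) at rho = 0.5, with lam = 0.3820 read from Table 1.
pair = (ratio_arrivals(0.3820, 0.5), ratio_service(2.0))
```

For this entry the arrival-side ratio is smaller than the service-side ratio, which is the pattern seen throughout the body of Table 2.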

By comparing the body of Table 2 giving the ratio $R/R_M$ with the right-hand column giving $R'/R_M$ we see that a change in arrival distribution can bring about a larger saving in waiting time than a change in service-time distribution. This is especially true when the traffic density is small.

A formula for the curvature of the likelihood surface of a sample drawn from a distribution admitting sufficient statistics


By B. RAJA RAO 
Department of Mathematics and Statistics, University of Poona, India 


1. SUMMARY


In this paper is given a formula for the curvature of the 'likelihood surface' (see §3) of a sample drawn from a distribution admitting sufficient statistics for the parameters, at the point represented by the maximum likelihood estimates (m.l.e.) of the parameters. More precisely it is shown in the case of two parameters that the negative of the trace and the determinant of Fisher's information matrix, respectively, measure the first and second (Gaussian) curvatures of the 'likelihood surface' at the point represented by the m.l.e.'s of the parameters. In the case of m parameters an expression for the Riemann curvature of the 'likelihood hypersurface' (see §4) of a sample at the point represented by the m.l.e.'s is obtained in terms of the elements of Fisher's information matrix. An expression for Riemann's curvature invariant is also obtained. A large sample interpretation is given.

2. INTRODUCTION


As Durbin & Kendall (1951) remark, apart from the use of a multi-dimensional space by Neyman and Pearson in connexion with fitting theoretical distributions, by Hotelling and Fisher in a discussion of maximum likelihood estimators, and apart from a remark by Huzurbazar (1949) concerning the radius of curvature of a likelihood curve, there seem to have been few attempts to consider estimation as a geometrical problem. In the following paper we shall make use of differential and Riemannian geometries to obtain the curvatures of a 'likelihood surface' and a 'likelihood hypersurface'.

Let $f(x, \theta_j)$ be the probability density function (p.d.f.) of a distribution involving a set of m unknown
parameters $\theta_j$, $j = 1, 2, \ldots, m$. Under certain general regularity conditions, Koopman (1936) has shown
that the necessary and sufficient condition that $f(x, \theta_j)$ should admit a set of m jointly sufficient
statistics is that $f(x, \theta_j)$ be in the form

$$f(x, \theta_j) = \exp\left\{\sum_{\nu=1}^{m} u_\nu(\theta_j)\, v_\nu(x) + A(x) + B(\theta_j)\right\}. \tag{1}$$

If $x_1, x_2, \ldots, x_n$ is a random sample of n independent observations from this distribution, we shall for
convenience describe $L = \sum_{i=1}^{n} \log f(x_i, \theta_j)$ as the log-likelihood function of the sample.
Let $\hat\theta_j$, $j = 1, 2, \ldots, m$, be a solution of the system of likelihood equations

$$\frac{\partial L}{\partial\theta_j} = 0 \qquad (j = 1, 2, \ldots, m),$$

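so that the $\hat\theta_j$ are the m.l.e.’s of the $\theta_j$. Then Huzurbazar (1949) has shown that the above system has
a unique consistent solution $\theta_j = \hat\theta_j$, $j = 1, 2, \ldots, m$, for all samples of any size, that this solution
makes the likelihood an absolute maximum, and has obtained the interesting property

$$\left.\frac{\partial^2 L}{\partial\theta_r\,\partial\theta_s}\right|_{\theta=\hat\theta} = \left.E\left(\frac{\partial^2 L}{\partial\theta_r\,\partial\theta_s}\right)\right|_{\theta=\hat\theta} = -\hat I_{rs}, \tag{2}$$

where $[I_{rs}]$ is Fisher’s information matrix and the circumflex (^) indicates that the value $I_{rs}$ is evaluated
at $\theta_j = \hat\theta_j$.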

In this paper we shall obtain a formula for the first and second curvatures of the ‘likelihood surface’ 
in the case of two parameters and a formula for the Riemann curvature of the ‘likelihood hypersurface’ 
represented by the likelihood function of the sample at the point given by the m.l.e.’s of the $\theta_j$.


3. CASE OF TWO PARAMETERS 


For the sake of simplicity we shall first consider the case of two parameters. Let $x_1, x_2, \ldots, x_n$ be a
random sample of size n from a distribution of Koopman’s form (1) depending on the two parameters
$\theta_1, \theta_2$. If we consider a fixed sample and variation in the parameter space, the locus described by
$L = L(\theta_1, \theta_2)$, in which one co-ordinate is written as a function of the others, will represent a surface in
the parameter space, which we might call the ‘likelihood surface’ of the sample, whose equation is now
said to be in Monge’s form.

One interesting question which immediately arises is why we should consider only samples drawn from
distributions admitting sufficient statistics. Why should we not sample from any distribution?
First of all, we need regularity conditions, and our results will not hold when the range of the distribution
depends on the parameters to be estimated. Secondly, the metric L for a distribution admitting sufficient
statistics is itself an interesting one and it is quite well known that this form of the metric is almost
inevitable in measuring the ‘distance’ between two populations. The role of sufficient statistics seems to
depend on the fact that $\partial L/\partial\theta$ is a function of $\theta$ and the sufficient statistics. Further, in our treatment we
require Huzurbazar’s result, given by equation (2), that at the m.l.e. point $\theta = \hat\theta$ (to take the instance
of a single parameter) the value of $\partial^2 L/\partial\theta^2$ is the same as its expected value evaluated at $\theta = \hat\theta$,
which is true only when there is a sufficient statistic for $\theta$. Such a point $\theta = \hat\theta$ on the likelihood surface
is known to be unique. In the case of a general distribution this is not possible.

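We shall now obtain the first and the second (Gaussian) curvatures of the likelihood surface at the
point $(\hat\theta_1, \hat\theta_2)$ represented by the m.l.e.’s of $\theta_1$ and $\theta_2$.

In the usual notation (Weatherburn, 1927, p. 75) we have

$$p = \frac{\partial L}{\partial\theta_1} = 0, \qquad q = \frac{\partial L}{\partial\theta_2} = 0$$

at, and only at, $(\hat\theta_1, \hat\theta_2)$ as shown by Huzurbazar. Also at the point $(\hat\theta_1, \hat\theta_2)$ we have

$$E = 1, \quad F = 0, \quad G = 1, \quad H^2 = EG - F^2 = 1$$

(note that the parametric curves $\theta_1$ = constant and $\theta_2$ = constant are orthogonal at $(\hat\theta_1, \hat\theta_2)$),

$$e = \partial^2 L/\partial\theta_1^2, \quad f = \partial^2 L/\partial\theta_1\,\partial\theta_2, \quad g = \partial^2 L/\partial\theta_2^2. \tag{3}$$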


(We have used e, f, g for the second fundamental magnitudes instead of L, M, N in Weatherburn’s 
notation to avoid confusion, for L is already used to denote the log-likelihood function of the sample.) 


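Note that by formula (2)

$$f = \left.\frac{\partial^2 L}{\partial\theta_1\,\partial\theta_2}\right|_{\theta=\hat\theta} = \left.E\left(\frac{\partial^2 L}{\partial\theta_1\,\partial\theta_2}\right)\right|_{\theta=\hat\theta} = -\hat I_{12} = 0$$

if $\theta_1$ and $\theta_2$ are orthogonal (Jeffreys, 1948), so that in such a case the parametric curves at $(\hat\theta_1, \hat\theta_2)$ are
also the lines of curvature. Thus at $(\hat\theta_1, \hat\theta_2)$

$$\begin{aligned}
T^2 &= eg - f^2 \\
&= \left(\frac{\partial^2 L}{\partial\theta_1^2}\right)\left(\frac{\partial^2 L}{\partial\theta_2^2}\right) - \left(\frac{\partial^2 L}{\partial\theta_1\,\partial\theta_2}\right)^2 \\
&= \begin{vmatrix} -\partial^2 L/\partial\theta_1^2 & -\partial^2 L/\partial\theta_1\,\partial\theta_2 \\ -\partial^2 L/\partial\theta_1\,\partial\theta_2 & -\partial^2 L/\partial\theta_2^2 \end{vmatrix} \\
&= \det\begin{bmatrix} \hat I_{11} & \hat I_{12} \\ \hat I_{12} & \hat I_{22} \end{bmatrix} \quad \text{(again by formula (2))} \\
&= \det[\hat I_{rs}],
\end{aligned} \tag{4}$$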


say, where $[I_{rs}]$ is Fisher’s information matrix. Then we have at the point $(\hat\theta_1, \hat\theta_2)$:

the first curvature

$$J = (Eg - 2Ff + Ge)/H^2 = \frac{\partial^2 L}{\partial\theta_1^2} + \frac{\partial^2 L}{\partial\theta_2^2} = -\{I_{11}(\hat\theta_1, \hat\theta_2) + I_{22}(\hat\theta_1, \hat\theta_2)\} = -\operatorname{trace}[\hat I_{rs}],$$

and the second curvature

$$K = T^2/H^2 = \det[\hat I_{rs}],$$

from equation (4). Thus the negative of the trace of the information matrix and the determinant of the
information matrix respectively measure the first and the second curvatures of the likelihood surface
represented by the likelihood function at $(\hat\theta_1, \hat\theta_2)$. Or in other words we note that at the point $(\hat\theta_1, \hat\theta_2)$
the curvatures $-J$ and $K$ are respectively the sum and product of the latent roots of the information
matrix.

For large samples we have another interpretation. The variance-covariance matrix of $\hat\theta_1$ and $\hat\theta_2$ is
then $[I_{rs}(\theta_1, \theta_2)]^{-1}$, the estimate of the determinant of which is $\{\det[\hat I_{rs}]\}^{-1} = 1/K$. Thus the reciprocal of
the Gaussian curvature of the likelihood surface at the point $(\hat\theta_1, \hat\theta_2)$ gives an estimate of the generalized
variance of the m.l.e.’s in large samples.
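
The two-parameter result can be checked numerically. The sketch below is an illustration only, not part of the paper: it takes a normal sample with parameters (mean, variance), which is of Koopman's form, evaluates the second derivatives of L at the m.l.e.'s by central differences, and compares J and K with minus the trace and the determinant of the information matrix. The function names, the step length h and the choice of the normal distribution are all assumptions made for the example.

```python
import math
import random


def log_likelihood(mu, var, xs):
    """Normal log-likelihood with parameters (mean, variance)."""
    n = len(xs)
    ss = sum((x - mu) ** 2 for x in xs)
    return -0.5 * n * math.log(2 * math.pi * var) - ss / (2 * var)


def second_derivatives(func, a, b, h=1e-4):
    """Central-difference estimates of the second derivatives of func(a, b)."""
    e = (func(a + h, b) - 2 * func(a, b) + func(a - h, b)) / h ** 2
    g = (func(a, b + h) - 2 * func(a, b) + func(a, b - h)) / h ** 2
    f = (func(a + h, b + h) - func(a + h, b - h)
         - func(a - h, b + h) + func(a - h, b - h)) / (4 * h ** 2)
    return e, f, g


def curvatures_at_mle(xs):
    """Return J, K at the m.l.e.'s with the information-matrix counterparts."""
    n = len(xs)
    mu_hat = sum(xs) / n
    var_hat = sum((x - mu_hat) ** 2 for x in xs) / n
    e, f, g = second_derivatives(
        lambda m, v: log_likelihood(m, v, xs), mu_hat, var_hat)
    # Information matrix of the normal sample evaluated at the m.l.e.'s.
    i11 = n / var_hat
    i22 = n / (2 * var_hat ** 2)
    return {
        "J": e + g,                    # first curvature (E = G = 1, F = 0, H = 1)
        "K": e * g - f ** 2,           # second (Gaussian) curvature
        "minus_trace": -(i11 + i22),
        "det": i11 * i22,
    }


if __name__ == "__main__":
    random.seed(0)
    sample = [random.gauss(2.0, 1.5) for _ in range(50)]
    print(curvatures_at_mle(sample))
```

In this sketch 1/K is the large-sample estimate of the generalized variance of the two m.l.e.'s, as described in the paragraph above.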


4. CASE OF SEVERAL PARAMETERS 


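Let us now consider the general case of m parameters $\theta_1, \theta_2, \ldots, \theta_m$. Let $S_{m+1}$ be a Euclidean space of
m + 1 dimensions referred to the co-ordinates $\phi_\alpha$, $\alpha = 1, 2, \ldots, m+1$, and having the metric $a_{\alpha\beta}\,d\phi_\alpha\,d\phi_\beta$.
Points of $S_{m+1}$ whose co-ordinates are expressible as functions of m independent variables $\theta_i$, $i = 1, 2, \ldots, m$,
are said to constitute a $V_m$ immersed in $S_{m+1}$. The space $V_m$ is then said to be a hypersurface. Let the
metric for the $V_m$ be denoted by $g_{ij}\,d\theta_i\,d\theta_j$. Then the coefficients of the two forms are connected by the
relations

$$g_{ij} = a_{\alpha\beta}\,\frac{\partial\phi_\alpha}{\partial\theta_i}\,\frac{\partial\phi_\beta}{\partial\theta_j}. \tag{5}$$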

Consider now the particular hypersurface defined by

$$\phi_1 = \theta_1, \quad \phi_2 = \theta_2, \quad \ldots, \quad \phi_m = \theta_m, \quad \phi_{m+1} = L(\theta_1, \theta_2, \ldots, \theta_m), \tag{6}$$

where $L(\theta_1, \theta_2, \ldots, \theta_m)$ is the log-likelihood function of the sample. We might call it the ‘likelihood
hypersurface’ of the sample. In such a case we achieve a simplification by choosing the co-ordinates
$\phi_\alpha$ in $S_{m+1}$ as Euclidean co-ordinates, so that

$$a_{\alpha\alpha} = 1, \qquad a_{\alpha\beta} = 0 \quad (\alpha \neq \beta);$$

then the coefficients $g_{ij}$ of the fundamental form for $V_m$ are given by


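$$g_{ij} = \frac{\partial\phi_1}{\partial\theta_i}\frac{\partial\phi_1}{\partial\theta_j} + \frac{\partial\phi_2}{\partial\theta_i}\frac{\partial\phi_2}{\partial\theta_j} + \ldots + \frac{\partial\phi_{m+1}}{\partial\theta_i}\frac{\partial\phi_{m+1}}{\partial\theta_j} = \delta_{ij} + \frac{\partial L}{\partial\theta_i}\frac{\partial L}{\partial\theta_j},$$

where $\delta_{ij}$ is 1 when i = j and 0 otherwise. Therefore we have the metric

$$g_{ij} = \frac{\partial L}{\partial\theta_i}\frac{\partial L}{\partial\theta_j} \quad (i \neq j), \qquad g_{ii} = 1 + \left(\frac{\partial L}{\partial\theta_i}\right)^2. \tag{7}$$
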
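Let now $(\hat\theta_1, \hat\theta_2, \ldots, \hat\theta_m)$ be a solution of the system of likelihood equations

$$\frac{\partial L}{\partial\theta_\nu} = 0 \qquad (\nu = 1, 2, \ldots, m). \tag{8}$$

Then at the point $(\hat\theta_1, \hat\theta_2, \ldots, \hat\theta_m)$ the metric $g_{ij}$ becomes, say $\hat g_{ij}$, where

$$\hat g_{ij} = 0 \quad (i \neq j), \qquad \hat g_{ii} = 1, \tag{9}$$

so that the reciprocal tensor $\hat g^{ij}$ is

$$\hat g^{ij} = 0 \quad (i \neq j), \qquad \hat g^{ii} = 1. \tag{10}$$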


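We shall now obtain an expression for the Riemann curvature of the likelihood hypersurface $V_m$ at the
point $(\hat\theta_1, \hat\theta_2, \ldots, \hat\theta_m)$ for the orientation determined by two unit vectors $A^i$ and $B^j$. This is defined by
(see Weatherburn, 1950, p. 117)

$$K = \frac{R_{rjnp}\,A^r A^n B^j B^p}{(g_{rn}\,g_{jp} - g_{rp}\,g_{jn})\,A^r A^n B^j B^p}, \tag{11}$$

where the quantities $R_{rjnp}$ are the components of Riemann’s curvature tensor and are defined by

$$R_{rjnp} = \frac{\partial}{\partial\theta_n}[r, jp] - \frac{\partial}{\partial\theta_p}[r, jn] + \begin{Bmatrix} l \\ jn \end{Bmatrix}[l, rp] - \begin{Bmatrix} l \\ jp \end{Bmatrix}[l, rn],$$

where $[k, ij]$ and $\begin{Bmatrix} k \\ ij \end{Bmatrix}$ are Christoffel symbols of the first and second kinds respectively,
defined by

$$[k, ij] = \frac{1}{2}\left(\frac{\partial g_{ik}}{\partial\theta_j} + \frac{\partial g_{jk}}{\partial\theta_i} - \frac{\partial g_{ij}}{\partial\theta_k}\right)$$

and

$$\begin{Bmatrix} k \\ ij \end{Bmatrix} = g^{kh}[h, ij].$$
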
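Thus we may write

$$R_{rjnp} = T_1 + T_2, \tag{12}$$

where

$$T_1 = \frac{1}{2}\left(\frac{\partial^2 g_{rp}}{\partial\theta_j\,\partial\theta_n} + \frac{\partial^2 g_{jn}}{\partial\theta_r\,\partial\theta_p} - \frac{\partial^2 g_{rn}}{\partial\theta_j\,\partial\theta_p} - \frac{\partial^2 g_{jp}}{\partial\theta_r\,\partial\theta_n}\right) \tag{13}$$

and

$$T_2 = g^{st}\left([s, jn]\,[t, rp] - [s, jp]\,[t, rn]\right). \tag{14}$$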
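
Using the metric defined by the equations (7) we obtain

$$T_1 = \frac{\partial^2 L}{\partial\theta_r\,\partial\theta_n}\,\frac{\partial^2 L}{\partial\theta_j\,\partial\theta_p} - \frac{\partial^2 L}{\partial\theta_r\,\partial\theta_p}\,\frac{\partial^2 L}{\partial\theta_j\,\partial\theta_n} \tag{15}$$

and

$$T_2 = -g^{st}\,\frac{\partial L}{\partial\theta_s}\,\frac{\partial L}{\partial\theta_t}\,T_1. \tag{16}$$

Evaluating the various quantities at the point $(\hat\theta_1, \hat\theta_2, \ldots, \hat\theta_m)$ we find, since $(T_2)_{\theta_j = \hat\theta_j} = 0$,

$$\begin{aligned}
\hat R_{rjnp} &= \left.\left(\frac{\partial^2 L}{\partial\theta_r\,\partial\theta_n}\,\frac{\partial^2 L}{\partial\theta_j\,\partial\theta_p} - \frac{\partial^2 L}{\partial\theta_r\,\partial\theta_p}\,\frac{\partial^2 L}{\partial\theta_j\,\partial\theta_n}\right)\right|_{\theta=\hat\theta} \\
&= (-\hat I_{rn})(-\hat I_{jp}) - (-\hat I_{rp})(-\hat I_{jn}) \\
&= \begin{vmatrix} \hat I_{rn} & \hat I_{rp} \\ \hat I_{jn} & \hat I_{jp} \end{vmatrix}
\end{aligned}$$

by formula (2). Using the formula (11) and choosing the vectors $A^i$ and $B^j$ as orthogonal (so that the
denominator of K is unity) we find that the Riemann curvature of the likelihood hypersurface $V_m$ at
the point $(\hat\theta_1, \hat\theta_2, \ldots, \hat\theta_m)$ associated with the orthogonal unit vectors $A^i$ and $B^j$ is given by

$$K = \sum_{r,j,n,p} \begin{vmatrix} \hat I_{rn} & \hat I_{rp} \\ \hat I_{jn} & \hat I_{jp} \end{vmatrix} A^r A^n B^j B^p. \tag{17}$$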








We are now faced by the problem: how many (distinct) non-zero terms are there in the above quadruple
series? It is easy to see that it is the same as the number of distinct arithmetical non-vanishing components
of the tensor $R_{rjnp}$, and it can be shown that the required number is $\tfrac{1}{12}m^2(m^2 - 1)$.
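
The quadruple series for K can be evaluated directly, which gives a simple numerical check. The sketch below is illustrative only: the 3 x 3 matrix stands in for a hypothetical information matrix, and all function names are assumptions. It sums the series term by term and compares the result with the algebraically collapsed form (A'IA)(B'IB) - (A'IB)(B'IA), which follows from expanding each 2 x 2 determinant; it also tabulates the count m²(m² - 1)/12 quoted above.

```python
import itertools
import math


def curvature_quadruple_sum(info, a, b):
    """Sum the quadruple series for K term by term over r, j, n, p."""
    m = len(info)
    total = 0.0
    for r, j, n, p in itertools.product(range(m), repeat=4):
        minor = info[r][n] * info[j][p] - info[r][p] * info[j][n]
        total += minor * a[r] * a[n] * b[j] * b[p]
    return total


def bilinear(info, u, v):
    """Return u' I v for a square matrix stored as nested lists."""
    return sum(u[i] * info[i][k] * v[k]
               for i in range(len(u)) for k in range(len(v)))


def curvature_closed_form(info, a, b):
    """The same sum collapsed to (a'Ia)(b'Ib) - (a'Ib)(b'Ia)."""
    return (bilinear(info, a, a) * bilinear(info, b, b)
            - bilinear(info, a, b) * bilinear(info, b, a))


def distinct_components(m):
    """Number of distinct non-vanishing components, m^2 (m^2 - 1) / 12."""
    return m * m * (m * m - 1) // 12


if __name__ == "__main__":
    # Hypothetical symmetric positive definite matrix, for illustration only.
    info = [[4.0, 1.0, 0.5],
            [1.0, 3.0, 0.2],
            [0.5, 0.2, 2.0]]
    s = 1.0 / math.sqrt(2.0)
    a = [s, s, 0.0]      # unit vector
    b = [s, -s, 0.0]     # unit vector orthogonal to a
    print(curvature_quadruple_sum(info, a, b), curvature_closed_form(info, a, b))
    print([distinct_components(m) for m in (2, 3, 4)])
```

When A and B are taken along the first two co-ordinate axes the sum reduces to the single 2 x 2 determinant formed from the first two rows and columns, which recovers the two-parameter result of § 3.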

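Since the Riemann curvature K determined by (17) is associated with the particular orientation defined
by the vectors $A^i$ and $B^j$, it is convenient to consider Riemann’s curvature invariant R, which is independent
of the particular orientation. It is defined by

$$R = g^{rp}\,g^{jn}\,R_{rjnp},$$

so that at the point $(\hat\theta_1, \hat\theta_2, \ldots, \hat\theta_m)$ it gives

$$\begin{aligned}
\hat R &= \hat g^{rp}\,\hat g^{jn}\,\hat R_{rjnp} \\
&= \hat R_{pnnp} \quad \text{from (10)} \\
&= \sum_{n,p} \begin{vmatrix} \hat I_{pn} & \hat I_{pp} \\ \hat I_{nn} & \hat I_{np} \end{vmatrix} \\
&= -\sum_{n \neq p} \begin{vmatrix} \hat I_{nn} & \hat I_{np} \\ \hat I_{pn} & \hat I_{pp} \end{vmatrix}.
\end{aligned}$$
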


Thus the sum of the determinants of all the possible $\tfrac{1}{2}m(m-1)$ leading 2 x 2 submatrices of the information
matrix measures the negative of Riemann curvature invariant $\hat R$ at the point $(\hat\theta_1, \hat\theta_2, \ldots, \hat\theta_m)$
represented by the m.l.e.’s of $\theta_1, \theta_2, \ldots, \theta_m$. As in the case of two parameters we note that the Riemann
curvature invariant at the point $(\hat\theta_1, \hat\theta_2, \ldots, \hat\theta_m)$ is the negative of the sum of the products of the latent
roots of Fisher’s information matrix taken two at a time.
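
The link between the 2 x 2 determinants and the latent roots rests on a standard identity: the sum of the principal 2 x 2 minors of a square matrix equals the second elementary symmetric function of its latent roots, which in turn equals half of (trace I)² minus trace (I²). The sketch below checks this on a hypothetical matrix; it reads the ‘leading’ submatrices of the text as those formed from every pair of rows n < p with the same pair of columns, and it verifies only the algebraic identity, not the normalizing convention that links the sum to R. Function names and the example matrices are assumptions.

```python
import math


def sum_principal_2x2_minors(info):
    """Sum of the determinants of the submatrices on rows and columns (n, p), n < p."""
    m = len(info)
    return sum(info[n][n] * info[p][p] - info[n][p] * info[p][n]
               for n in range(m) for p in range(n + 1, m))


def latent_root_pair_products(info):
    """Sum of products of latent roots taken two at a time.

    Uses the identity e2 = ((trace I)^2 - trace(I^2)) / 2, so the latent
    roots themselves need not be computed.
    """
    m = len(info)
    trace = sum(info[i][i] for i in range(m))
    trace_of_square = sum(info[i][k] * info[k][i]
                          for i in range(m) for k in range(m))
    return 0.5 * (trace * trace - trace_of_square)


if __name__ == "__main__":
    # Hypothetical symmetric information matrix, for illustration only.
    info = [[4.0, 1.0, 0.5],
            [1.0, 3.0, 0.2],
            [0.5, 0.2, 2.0]]
    print(sum_principal_2x2_minors(info), latent_root_pair_products(info))
```

For a diagonal matrix with latent roots 2, 3 and 5 both functions give 2·3 + 2·5 + 3·5 = 31, which makes the identity easy to verify by hand.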

As in §3 we could give a large sample interpretation of our result. 


I wish to record my great indebtedness to Prof. V. S. Huzurbazar for his constant encouragement 
during the course of this investigation, and to Dr (Mrs) M. Apte for some helpful discussions I have had, 
which have benefited me greatly in § 4. My grateful thanks are due to Prof. M. G. Kendall for his valuable 
suggestions and comments on an earlier draft of this paper which have added to its value. 


REFERENCES 


DURBIN, J. & KENDALL, M. G. (1951). Biometrika, 38, 150.
HUZURBAZAR, V. S. (1949). Biometrika, 36, 71.
JEFFREYS, H. (1948). Theory of Probability (2nd ed.). Oxford: Clarendon Press.
KOOPMAN, B. O. (1936). Trans. Amer. Math. Soc. 39, 399.
WEATHERBURN, C. E. (1927). Differential Geometry. Cambridge University Press.
WEATHERBURN, C. E. (1950). Riemannian Geometry. Cambridge University Press.


13 Biom. 47 













Reviews 


Statistics of Extremes. By E. J. GUMBEL. New York: Columbia University Press. 1958.
Pp. xx +373. $15.00. 


A saying, to be found equivalently in many languages besides English, states that ‘a chain is 
as strong as its weakest link’. This aphorism states succinctly and accurately why statisticians, 
and especially perhaps statisticians advising engineers, are concerned with extreme value problems. 
E. J. Gumbel has been concerned for many years with the study of these and allied problems and has, 
in this present book, put down the fruits of his researches. The book, written by a specialist, is therefore 
one which would a priori command attention, and the contents discover to us that this attention is
very much worth while. 

Prof. Gumbel requires of his readers that they should have as prerequisites some calculus, some 
analysis and a little more than a smattering of mathematical statistics and probability. As delineated 
by the author the mathematical structure of the book is not difficult but the uninitiated will not be able 
to plunge into it in a carefree way. The book begins with a longish chapter (38 pages) in which the 
author sets out the tools he is going to use and seeks to orientate his readers so that they understand 
the blue-print from which he is going to work. Chapter 2 is on order statistics and Chapter 3 on the 
exact distribution of extremes. As is well known the study of exact distributions in any form of 
‘order’ problem is, on the whole, an unprofitable proceeding, so we turn in Chapter 4 to the analytic
study of extremes, with asymptotic results following in Chapters 5, 6 and 7. The range has Chapter 8 to 
itself. There are 44 tables and over 100 graphs. 

The emphasis, as far as the author’s examples go, is on mechanical engineering problems (such as 
fatigue in metals) and on climatic difficulties which are really civil engineering problems (such as floods 
and droughts). But it would be a mistake to conclude that only engineers with a statistical slant could 
profitably read and learn. Anyone who is interested in extreme values will have to read this treatise 
in order to find out what has been done. In the limited field of mathematical statistics which this 
book delineates it is encyclopaedic and can be unhesitatingly and unreservedly recommended. 


F. N. DAVID 


International Journal of Abstracts: Statistical Theory and Method. Vol. 1, No. 1, ed. 
W. R. BUCKLAND. London and Edinburgh: Oliver and Boyd for the International
Statistical Institute. 1959. Subscription £5 or $16 per annum 


This new journal, to be issued quarterly, is a compilation of straight abstracts, of about 300 words, 
without comment. There are 160 abstracts in this number, printed two to a page (on one side only, so 
that they can be cut out and stuck on index cards if desired). They vary in length and precision 
depending to some extent on the length and originality of the papers abstracted and also on the degree 
to which the divers abstractors have digested what they have read. Since the whole relevant literature 
(nearly 400 journals) is scanned the price of the journal is reasonable and it provides, in abstract form 
at least, translations of Russian and Japanese papers for instance which are otherwise effectively 
unobtainable. Thus, for the first time, it will be possible for most statisticians to feel reasonably sure of 
being aware, at least in outline, of all recent work. 

There is no table of contents nor index (but a running index is to be initiated at the end of the 
volume); here, the abstracts are classified into eleven groups of nine subsections, the papers being 
arranged alphabetically, according to author, in each group. Since many of the abstracts are highly 
condensed the effect of misprints is much more damaging than in other texts and it would have been 
worthwhile to take more trouble with the proof reading even at the price of delay and extra cost. 
Taken all in all, however, it is a product of remarkably high quality and we may count ourselves very
well served if we continue to see its like.


Testing Statistical Hypotheses. By E. L. LEHMANN. New York: John Wiley and
Sons Inc.; London: Chapman and Hall. 1959. Pp. 369. 88s.


An outline of this book appeared in 1949, and in the years following parts of it were presented in 
lectures by the author. The whole field of testing statistical hypotheses is not covered in this volume; 
the main concern being the two-decision problem as it presents itself after the experiment has been 
set up. Other areas not fully explored are essentially the chi-square and likelihood ratio tests. 

The first two chapters are introductory, the first introducing in detail the general decision problem, 
based on loss and risk considerations ; in the second the background of probability as a part of the theory 
of measure and integration is outlined. This general decision approach is, as emphasized, only to serve
as a basis, for throughout the remainder of the book the Neyman–Pearson formulation with the simpler
notion of error instead of loss is used.

The next four chapters may be considered as representative of the book. With the mathematical 
framework of maximizing an integral subject to certain side conditions or, more specifically, based on 
the Neyman–Pearson fundamental lemmas, an extensive discussion of the theory of uniformly most
powerful tests is given. Likewise the concept of uniformly most powerful tests is discussed in connexion 
with the class of unbiased tests, and invariance tests. This developed theory is then applied to various 
statistical problems, special attention being paid to the important exponential family and the closely 
related problem of confidence bounds; it is also shown that various existing tests have optimum 
properties within certain classes. 

In the remaining two chapters the theory and underlying ideas of analysis of variance, chi-square 
tests, likelihood ratio tests and minimax procedures are briefly sketched. 

Throughout use is made of numerous examples to illustrate and emphasize, and each chapter is 
concluded with an extensive set of classified problems. Worthy of mention also are the references 
which are printed in such a way that they give a good guide to the related literature. As to the 
presentation of the material, a slight lack of explanation may be felt in some instances. However, 
although the whole field of testing statistical hypotheses is not covered, the book makes clear the 
author’s comprehensive study and research in this theory, and both the teacher and student in the
field will find the book a most useful one.


Die axiomatischen Grundlagen einer allgemeinen Theorie des Messens. (Axiomatic 
foundations of the general theory of measurement). By J. PFANZAGL. Würzburg:
Physica Verlag. 1959. Pp. 63. DM 14. 


This is the first of a series of monographs from the Statistical Institute of the University of Vienna. 
It is especially appropriate to a pioneer publication in that it studies a problem which does not 
appear frequently in statistical literature. This is the problem of defining what are the essential 
properties of a ‘measurement’. The author says that it is common practice to require that measure- 
ment puts every member of a set M into the ‘strongest possible’ relationship with a set of real numbers 
(the measurements): and that, in most cases, it is required that some form of ‘additivity’ condition 
should be satisfied. The author claims that this last condition is unnecessarily restrictive, and suggests 
replacing it by a general ‘so-called metrical operation’: continuous, monotonic and one-to-one.

It is salutary to pay close attention, from time to time, to the true nature of measurements (the 
‘observational data’ or ‘sample values’ of statistics). Such works as the present monograph remind 
us that the very concept of ‘measurement’ is capable of modification, and perform a useful service in 
drawing attention to fundamentals. Many statisticians will find the present work rather abstract, but 
an understanding of the author’s aims and methods should repay the necessary effort. 

N. L. JOHNSON 


Statistical Estimates and Transformed Beta-Variables. By GUNNAR BLOM. New
York: John Wiley and Sons, Inc.; Stockholm: Almqvist and Wiksell; London:
Chapman and Hall. 1958. Pp. 176. 40s.


This book is presumably a published form of a doctoral thesis. The author states that he began by
being interested in estimation by means of order statistics and was led from this (or perhaps one should
say through this) to an investigation of the theory of estimation. The first two chapters consist of
a mathematical discussion of the Fisherian theory of estimation with some generalization in the
direction which the author is going to require for the rest of his researches. In Chapter 3 he moves 
away from the usual terminology of order statistic to that of transformed beta-variable. It is well 
known that any order statistic has a beta-distribution when expressed in terms of its cumulative 
distribution function. It is doubtful whether this new nomenclature adds clarity to our ideas on the 
subject but the author thinks that he has generalized the concept by this means. 

From the change in nomenclature we go on to properties and approximate properties of transformed 
beta-variables, and of linear combinations of them. We then have the applications of the theory of 
linear estimation to these variables. This section of about 70 pages is important, particularly in 
connexion with the treatment of estimation in censored samples, a topic which has recently been 
receiving a certain amount of attention. 

The book throughout is written in highly condensed fashion with due attention to mathematical 
rigour. The reader envisaged can only be a mathematical statistician since no clear picture emerges as to
possible practical applications.


F. N. DAVID


Statistics: An Introduction. By D. A. S. FRASER. New York: John Wiley and Sons Inc.;
London: Chapman and Hall. 1959. Pp. ix+396. 54s. 


This is yet another book on the mathematical theory of statistics. It is intended, the author tells 
us, as a text for a two-semester course for students with the prerequisite of a full year of calculus.
(It might also have been added that some knowledge of matrix algebra is desirable.) The book follows 
the orthodox theme. The topics covered, in order of presentation, are discrete and continuous probability 
distributions, expectation techniques and generating functions, central limit theorem, single and 
two-way classification, $\chi^2$ and F, estimation, testing hypotheses, confidence intervals, regression
analysis, factorial designs with one, two and three factors, randomization, analysis of covariance, and 
fractional replication, sequential analysis and non-parametric methods. The degree of sophistication 
required is a little more than that for A. M. Mood, Elementary Theory of Statistics. 

The level of exposition of this book is adequate and it will be found useful by mathematics students 
wanting to learn something of the mysteries of the statistical craft. If it does not add anything in the 
way of fresh theory or example to what has already been published this is not the fault of the author, 
but merely reflects the fact that a great many ‘introductions’ and ‘elements’ have already appeared. 
For those students who seek to buy a text-book on statistics for the first time this is a suitable book to 
buy. For those students who have already bought an introductory text-book stronger meat than this
will probably be required.


F. N. DAVID


Operations Research. An Annotated Bibliography. Second edition. By James H. 
BATCHELOR. Saint Louis, Missouri: Saint Louis University Press. 1959. $10.


The price of this book is likely to put it out of the reach of private individuals, but it can be said 
without hesitation that this work should be in any library concerned with its subject. It lists 4195 
titles of papers or books concerned with operational research and in most cases gives an abstract of 
from 50 to 150 words (occasionally more) of the papers listed. The author refers in his introduction to 
paper 1321, by Merrill M. Flood in which it is said that ‘operations research represents nothing more 
than a name for the post-war increase in the quantity and quality of scientific research on industrial 
management and operating problems’, and in the light of this, completeness is impossible of attainment 
in such a bibliography. A notable omission is the absence of reference to the book by McGuire, 
Beckmann and Winsten on problems of transport, which is in many ways a model of how operational
research should be carried out. And doubtless a further search would reveal further omissions. On the 
other hand, a notable inclusion is: 0060, E. D. Adrian, The Physical Background of Perception. 

The value of the bibliography is greatly enhanced by an index, mainly by subject, which occupies 
84 pages. 

Mr Batchelor has put us all in his debt by his painstaking work.


G. A. BARNARD


Quality Control and Industrial Statistics. (Revised edition.) By ACHESON J. DUNCAN.
Homewood, Illinois: Richard D. Irwin Inc. 1959. Pp. xxxiii + 946. $10.00.


This second enlarged and revised edition is to be very warmly welcomed. The main changes are the 
addition of 250 pages of text, the creation of a separate section for rectifying inspection, and an 
improved lay-out (with additions) for acceptance sampling by variables. For the rest, apart from some 
minor pruning and addition, and the compilation of more up-to-date reference lists, the text follows 
that of the first edition reviewed by Cox in Biometrika, volume 40. It is convenient to discuss the
extra material (with the approximate number of extra pages given in brackets) under the headings
given in the book.


Part II. Lot acceptance sampling plans. The enlarged section on sampling inspection by variables 
includes a description of the Mil. Std. 414, a good exposition of sequential probability ratio sampling 
plans, and a short chapter comparing sampling plans by the average sample numbers, and administrative
features (40 pages). There is also a new chapter (9 pages) on the Barnard–Enters–Hamaker
multiple sampling procedure. 

Part III. Rectifying inspection. This part contains material which was intermingled with other 
sampling plans in the first edition, and the creation of a separate part for rectifying inspection is a great 
improvement. Additional material includes more discussion on Dodge’s plans, with variation, similar 
schemes by Wald & Wolfowitz and Girshick, a brief survey of recent U.S. Government handbooks, 
and a discussion of Anscombe’s scheme (12 pages). 

Part IV. Control charts. The extra material in this part is mainly a discussion on the design of 
continuous sampling plans with a summary of a theoretical study by Duncan based on minimizing 
over-all cost, and a short summary of other control charts not given in detail (11 pages). 

Part V. Some statistics useful in industrial research. About two-thirds of the extra material is in 
this section. The analysis of variance alone has 70 extra pages, including general linear comparisons, 
nested classifications, subgrouping and ranking means, confidence intervals and tables of expected 
values of mean squares for various models of components of variance analysis. A new chapter on the 
analysis of covariance is very clear, including both theory and numerical examples (33 pages). An 
extra chapter on the design of experiments deals with confounding and fractional replication (14 pages). 
Lastly, there is a new chapter on mapping response surfaces, with a detailed description of the
Box–Wilson technique (44 pages). The chapter is typical of the book, with a clear presentation of the theory
illustrated with numerical examples. The additions to this last part of the book represent a considerable 
improvement on the previous edition. 

Many of Cox’s criticisms of the first edition still stand. The author has obviously considered and 
anticipated such criticisms, for in his preface he says: 

‘As it now stands, the text is reasonably comprehensive and integrated. It nevertheless includes 
very little on non-parametric tests, nothing on stratified and multistage sampling and a meagre 
amount of material on the economics of statistical procedures. Several excellent volumes are available, 
however, on stratified sampling and other aspects of sampling survey techniques, and with the recent 
appearance of two books devoted entirely to non-parametric methods, it was felt that neglect of this 
important subject was justified. Work on economics of statistical procedures is very new and does 
not yet appear to be sufficiently standardized for comprehensive treatment in a text.’ 

In this paragraph the author himself gives the main criticism liable to arise, and we may well not 
agree with his defence for omitting the important material to which he refers. The reviewer feels that 
for a reference book of this type, some small space at least could have been given to sampling techniques, 
and two or three of the most useful non-parametric techniques, such as the Fisher–Terry normalized
ranks, and the Mann–Whitney and Kolmogorov–Smirnov tests would have added little extra to the
length. Indeed, one could well argue that the 127-page introduction describing probability, frequency
distributions, etc., would be better spent on such material.


G. B. WETHERILL


Health and Medical Care in New York. Report by Committee for the Special Research 
Project in the Health Insurance Plan of Greater New York. Harvard: University 
Press for Commonwealth Fund ; London : Oxford University Press. 1957. Pp. 275. 60s. 

The Health Insurance Plan for New York is an organization which provides, on a pre-paid insurance
basis, medical services for large groups of workers and their families. The doctors concerned work in
partnerships or group practices and are paid a capitation fee. The dearth of actuarial data in the United
States on the demands and costs of such a service led to this survey of the working of the service using
both the routinely kept records of the participating physicians and a sampling inquiry designed to 
elicit the medical needs and demands of both the families enrolled in the plan and of other families in 
New York. At the same time, an attempt was made to estimate the quality of medical care received 
by both population groups, whether from the Health Insurance Plan or from other sources. There was
also some interest in the possibility of using the H.I.P. experience as a guide to the medical service needs 
of the whole New York population. 

The sampling methods used are discussed in some detail. From the total H.I.P. population, a 10% 
quasi-random sample was drawn by selecting each family whose enrolment certificate number ended 
in 5. The remainder of the New York population was sampled on an area probability basis; from 1000 
city blocks, divided into fifty strata each of twenty blocks according to population characteristics such 
as density and recent increase, were drawn a proportion of families to give a total sample of 5000. 
The practical difficulties encountered in carrying out this plan are very fully and frankly discussed and 
the derivation of sampling errors for the results eventually obtained is clearly set out. 

The design of the questionnaire used in the field survey owes much to the wide experience in this 
field which has been gathered by the U.S. Bureau of the Census and the detailed instructions to inter- 
viewers on the interpretation and coding of alternative answers form a most useful model for others 
venturing into this type of medical survey. Tabulation followed a prearranged scheme designed to 
answer the main questions posed by the guiding committee and to provide appropriate cross-checks 
on the accuracy of the information obtained. The preliminary analyses showed that there were severe 
practical limitations on the accuracy of certain types of data, such as income, given by the single 
informant for each family. On the other hand, comparisons between some of the results obtained and 
other official data showed a satisfying consistency. 

The information on the prevalence and incidence of acute and chronic disease has its own intrinsic 
interest and the differences between the samples in their use of medical facilities are most interestingly 
discussed. The emphasis in this discussion is administrative rather than clinical: hospital admissions 
and frequency of dental treatment in relation to the educational status of the head of the household 
are typical subjects of particular interest. As an extremely competent description of a medico-social 
survey this book is to be commended. Lavishly produced, 275 pages long with much tabulated material, 
it may have a limited audience, but it should be included in any specialized library of reference texts 
on medical or social surveys. D. D. REID 


Tuberculosis in White and Negro Children. Harvard: University Press for Commonwealth 
Fund; London: Oxford University Press. 1958. 1. The Roentgenologic 
Aspects of the Harriet Lane Study. By Janet B. Hardy. Pp. 122. 60s. 2. The 
Epidemiological Aspects of the Harriet Lane Study. By Miriam E. Brailey. 
Pp. 103. 36s. 


This study of tuberculosis in childhood is published in two volumes and is based on the wide 
experience of the disease among 1329 children, 437 from 180 white families, and 892 from 327 negro 
families. These were patients in the special Tuberculosis Clinic of the Harriet Lane Home of the Johns 
Hopkins Hospital in Baltimore. This Home was established in 1928 with such a study of the natural 
history of tuberculosis in children in mind and it is clear from these volumes that an immense effort 
has gone into the realization of this scheme. On the other hand, it should be pointed out at the outset 
that the area served was one of the poorest in Baltimore and both white and negro families lived in 
crowded and often insanitary homes. Again, the period covered, from 1928 to 1950, preceded the 
development and wide use of the powerful new antibiotics against tuberculosis which have radically 
altered the outlook in this disease. For these reasons, some at least of the findings cannot be directly 
applied to other communities or to the present time. Nevertheless, there is much of value in these 
volumes that will repay attention. 

From the viewpoint of the biometrician new to the field of tuberculosis the first volume will provide 
a very clear introduction to the clinical concepts and terms in current use. Dr Janet B. Hardy, the 
present Director of the Clinic, gives a commentary on an excellent series of chest roentgenograms of 
a wide range of tuberculous conditions, pointing out the characteristics important in their interpretation 
and classification. The second section of Volume I discusses the indications and techniques of broncho- 
scopy and bronchography. 
Volume II, written by a previous Director, Dr Miriam E. Brailey, deals with the epidemiological 
aspects of the study. Since this work was done in collaboration with the Department of Epidemiology 
at Johns Hopkins, it necessarily reflects the concepts and technical methods developed by Wade 
Hampton Frost and his successors. This means a careful and critical attention to detail and the use of 
life-table techniques in the analysis of survival rates and reinfection rates in varying circumstances. 
There is a good discussion of the relevance of the findings to the vexed problem of the hazards of 
superinfection and, although some may dispute a few of the conclusions, this text is to be recommended 
to any serious student of the complexities of the epidemiology of tuberculosis. D. D. REID 


The Structure of Arithmetic and Algebra. By M. H. Maria. New York: John Wiley 
and Sons Inc.; London: Chapman and Hall Ltd. 1958. Pp. vii+ 294. 48s. 


In 1547 William Buckley, Englishman, wrote a treatise entitled The Rules of Arithmetic, which 
[treatise] is noted chiefly for a short discussion on combinations. Much of the book, however, is con- 
cerned with rules for writing down numbers, for addition and subtraction and the other arithmetic 
operations. An expatriate from Mars who left England in 1547 might accordingly be predisposed to 
wonder at the lack of progress of English science since Prof. Maria is concerned to discuss many of the 
same topics as Buckley. Prof. Maria is, in fact, the latest of a long line of mathematical logicians who 
are concerned to describe on what mathematical thought is based. She does not waste much space on 
what is a number but moves briskly through the operations of addition, subtraction, multiplication and 
division. There are chapters on rational numbers, real numbers, irrational numbers and the 
properties of integers. In the generalization of the distributive law a little combinatorial theory 
is introduced. 

This book may be interesting to the beginner in mathematical lore if he is prepared to recognize that 
in the end he will have to take almost everything here discussed for granted. For the statistician, who 
in many circumstances would be hard pressed to defend the logic of his operations, it is unlikely that 
this book will have much appeal. F. N. DAVID 


Tables of the Normal Probability Integral. (Sandia Corporation Technical Memorandum.) 
By D. T. Monk and D. B. Owen. U.S. Department of Commerce: Office 
of Technical Services. 1957. Pp. 58. 40 cents. 


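The function

$$G(h) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{h} e^{-\frac{1}{2}x^{2}}\,dx$$

is tabulated to eight decimal places for $h = 0.000\,(0.001)\,7.000$. An auxiliary table of Mills’ Ratio

$$M(h) = \frac{1}{\sqrt{2\pi}} \int_{h}^{\infty} e^{-\frac{1}{2}x^{2}}\,dx \bigg/ \frac{1}{\sqrt{2\pi}}\, e^{-\frac{1}{2}h^{2}}$$

is given to ten decimal places in $M(h)$ and for $h = 50\,(1)\,500$. 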
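
As a rough illustration of the two tabulated functions, the following Python sketch evaluates G(h) and Mills' Ratio with the standard library only. The function names are hypothetical. The continued fraction used for large h is Laplace's classical expansion, swapped in here because the direct quotient underflows in double precision long before h = 50; it is not claimed to be the method used by the authors of the tables.

```python
import math


def G(h):
    """Normal probability integral from minus infinity to h."""
    return 0.5 * math.erfc(-h / math.sqrt(2.0))


def mills_ratio(h):
    """Upper tail area divided by the ordinate at h (direct quotient)."""
    tail = 0.5 * math.erfc(h / math.sqrt(2.0))
    ordinate = math.exp(-0.5 * h * h) / math.sqrt(2.0 * math.pi)
    return tail / ordinate


def mills_ratio_cf(h, terms=200):
    """Laplace continued fraction 1/(h + 1/(h + 2/(h + 3/(h + ...)))), for h > 0.

    Evaluated from the innermost retained term outwards.
    """
    value = h
    for k in range(terms, 0, -1):
        value = h + k / value
    return 1.0 / value
```

For moderate h the two versions of Mills' Ratio should agree closely, and for large h only `mills_ratio_cf` remains usable. Ordinary floating point will not in general reproduce ten-decimal table entries exactly.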





The Bivariate Normal Probability Distribution. (Sandia Corporation Research 
Report.) By D. B. Owen. U.S. Department of Commerce: Office of Technical 
Services. 1957. Pp. 136. 65 cents. 


Various methods have been used to calculate bivariate normal probabilities where the boundaries 
of the region over which the probabilities are to be computed are straight lines. This publication is mainly 
concerned with the use and tabulation of an auxiliary function $T(h, a)$ defined below. If $f(x, y; \rho)$ 
denotes the standardized bivariate normal density with correlation $\rho$, and

$$G(h) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{h} e^{-\frac{1}{2}t^{2}}\,dt,$$

then

$$B(h, k; \rho) = \int_{-\infty}^{h}\int_{-\infty}^{k} f(x, y; \rho)\,dx\,dy$$

can be put in terms of $G(h)$, $G(k)$ and a function

$$T(h, a) = \frac{1}{2\pi} \int_{0}^{a} \frac{e^{-\frac{1}{2}h^{2}(1+x^{2})}}{1+x^{2}}\,dx.$$

In a 76 pp. Appendix, $T(h, a)$ is tabulated to six decimal places for 
$a/h = 0.000\,(0.025)\,1.000$ and $h = 0.00\,(0.01)\,4.75$. 
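
The auxiliary function can be evaluated by straightforward quadrature, as in the Python sketch below. It assumes the integral definition of T(h, a) given above and uses Simpson's rule, which is a convenient choice for illustration and not necessarily the method used to compute the published table. The function names and the `panels` parameter are hypothetical.

```python
import math


def normal_cdf(h):
    """G(h): standard normal probability integral up to h."""
    return 0.5 * math.erfc(-h / math.sqrt(2.0))


def owens_t(h, a, panels=200):
    """Simpson's-rule approximation to T(h, a) on the interval [0, a]."""
    if panels % 2:
        panels += 1  # Simpson's rule needs an even number of panels

    def integrand(x):
        return math.exp(-0.5 * h * h * (1.0 + x * x)) / (1.0 + x * x)

    step = a / panels
    total = integrand(0.0) + integrand(a)
    for i in range(1, panels):
        total += (4 if i % 2 else 2) * integrand(i * step)
    return total * step / 3.0 / (2.0 * math.pi)
```

Two checks follow directly from the definition and from a known identity: T(0, a) reduces to arctan(a)/(2π), and T(h, 1) equals G(h)[1 − G(h)]/2.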


Tables of the Bivariate Normal Distribution Function and Related Functions. 
Computed and collated under the direction of Gertrude Blanch. U.S. Department 
of Commerce: National Bureau of Standards. Applied Mathematics Series 50. 1959. 
Pp. lxv + 258. $3.25. 

This is a definitive table of the bivariate normal distribution, amplifying and extending the original 
tables computed under Karl Pearson’s direction, which were brought together in Tables for Statisticians 
and Biometricians, 2, now out of print. Write

$$L(h, k, \rho) = \int_{h}^{\infty} dx \int_{k}^{\infty} g(x, y, \rho)\,dy,$$

where

$$g(x, y, \rho) = \frac{1}{2\pi\sqrt{1-\rho^{2}}} \exp\left[-\tfrac{1}{2}(x^{2} - 2\rho xy + y^{2})/(1-\rho^{2})\right]$$

and

$$V(h, k) = \frac{1}{2\pi} \int_{0}^{h} e^{-\frac{1}{2}x^{2}}\,dx \int_{0}^{kx/h} e^{-\frac{1}{2}y^{2}}\,dy.$$


Then the volume contains the following tables in addition to a 45 pp. Introduction. 

Table I. Six decimal places given: $L(h, k, \rho)$ for $h, k = 0\,(0.1)\,4$; $\rho = 0\,(0.05)\,0.95\,(0.01)\,1$. 

Table II. Seven decimal places given: $L(h, k, -\rho)$ for $h, k = 0\,(0.1)\,h_n, k_n$; $\rho = 0\,(0.05)\,0.95\,(0.01)\,1$, 
where $L(h_n, k_n, -\rho) < \tfrac{1}{2} \times 10^{-7}$ if $h_n$ and $k_n$ are both less than 4. 

Table III. Seven decimal places given: $V(h, \lambda h)$ for $\lambda = 0.1\,(0.1)\,1$; $h = 0\,(0.01)\,4\,(0.02)\,4.6\,(0.1)\,5.6$ 
and $\infty$. 

Table IV. Seven decimal places given: $V(\lambda h, h)$ for $\lambda = 0.1\,(0.1)\,1$; $h = 0\,(0.01)\,4\,(0.02)\,5.6$ and $\infty$. 

Table V. Eight decimal places given: $y = (\arcsin r)/(2\pi)$; $r = 0\,(0.01)\,1$ (second central differences). 
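
Two of the tabulated quantities lend themselves to a short numerical illustration. The Python sketch below is a minimal one under stated assumptions. It uses the classical result L(0, 0, ρ) = 1/4 + (arcsin ρ)/(2π), which is where the function of Table V enters. It evaluates V(h, k) by reducing the inner integral to the normal probability integral and applying Simpson's rule to the outer one. The function names and the `panels` parameter are hypothetical, and the quadrature is not claimed to be the method used for the published tables.

```python
import math


def normal_cdf(h):
    """Standard normal probability integral up to h."""
    return 0.5 * math.erfc(-h / math.sqrt(2.0))


def orthant_probability(rho):
    """L(0, 0, rho) = 1/4 + arcsin(rho) / (2 * pi)."""
    return 0.25 + math.asin(rho) / (2.0 * math.pi)


def V(h, k, panels=400):
    """Simpson's rule for V(h, k), with the inner integral done in closed form."""
    if h == 0.0:
        return 0.0
    if panels % 2:
        panels += 1  # Simpson's rule needs an even number of panels

    def integrand(x):
        ordinate = math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)
        # Inner integral from 0 to k*x/h of the normal ordinate.
        return ordinate * (normal_cdf(k * x / h) - 0.5)

    step = h / panels
    total = integrand(0.0) + integrand(h)
    for i in range(1, panels):
        total += (4 if i % 2 else 2) * integrand(i * step)
    return total * step / 3.0
```

A simple consistency check is available when k = h: the region of integration is then half of a square, so V(h, h) should equal [G(h) − 1/2]²/2, where G is the normal probability integral.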


Tables of natural logarithms for arguments between five and ten to sixteen 
decimal places. U.S. Department of Commerce: National Bureau of Standards. 
Applied Mathematics Series 53. 1958. (Supersedes Mathematical Table 12). Pp. xiii + 506. $4. 


This table is a reissue of Volume 4 of a four-volume table of logarithms published in 1941. $\log_e x$ is 
tabulated for $x = 5.0000\,(0.0001)\,10.0000$ to sixteen decimal places. No differences are given. 
The companion volume to this, $\log_e x$ for $x = 0.0000\,(0.0001)\,5.0000$, was noticed in Biometrika, 41, 286. 
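
An entry at this precision lies at the edge of what ordinary double-precision arithmetic can give, so the sketch below uses Python's `decimal` module with extra working digits before rounding to sixteen places. The function name and the working precision of 30 digits are assumptions made for the illustration.

```python
from decimal import Decimal, getcontext

# Work with more digits than are finally kept, so the last place rounds correctly.
getcontext().prec = 30


def natural_log_16dp(x_text):
    """Natural logarithm of a decimal argument, rounded to sixteen decimal places."""
    value = Decimal(x_text).ln()
    return value.quantize(Decimal("1e-16"))


first_entry = natural_log_16dp("5.0000")
last_entry = natural_log_16dp("10.0000")
```

Passing the argument as text keeps it exact, which matters for arguments such as 5.0001 that have no exact binary representation.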


Tables and Percentage Points of the Distribution Function of a Product. By Leo 
A. Aroian. California: Hughes Aircraft Company. 1957. Pp. ix + 99. 


$x$ and $y$ are two normal uncorrelated variables with means $m_1$ and $m_2$ and variances $\sigma_1^2$ and $\sigma_2^2$, respectively, 
and $\delta_i = m_i/\sigma_i$ $(i = 1, 2)$. (i) The $100\alpha$ percentage points of the distribution of $z = xy/(\sigma_1\sigma_2)$ are 
tabulated for different values of $\delta_i$ $(i = 1, 2)$. (ii) The cumulative distribution function of $z$ is also 
tabulated. 

For (i), $\alpha$ = 0.001, 0.005, 0.010, 0.025, 0.050, 0.100, 0.250, 0.500, 0.750, 0.900, 0.950, 0.975, 0.990, 0.995, 
0.999 and $\delta_1, \delta_2 = 0\,(0.4)\,4.0, 6.0, 12.0$. Four decimal places are tabulated. 

For (ii), $\delta_1, \delta_2 = 0\,(0.4)\,4.0, 6.0, 12.0$ and $F(z) = F(xy)$ is tabulated against $z$ to six decimal places. The 
argument interval for $z$ varies, but in the middle range between $z = -4.0$ and $4.0$ it is 0.2 
or 0.1. 
Rectangular-Polar Conversion Table. (Royal Society Mathematical Tables, Vol. 2.) 
By E. H. Neville. Cambridge University Press for the Royal Society. 1956. 
Pp. xxii + 109. 30s. 
Given $x = r\cos\theta$ and $y = r\sin\theta$, this table is concerned with the evaluation of $r$ and of $\theta$ (in 
degrees) and of $\log r$ and of $\theta$ (in radians) for different values of $x$ and $y$. 
For $x = 1\,(1)\,105$ and $y = x\,(1)\,105$, $\log_e r$ is tabulated to 15 decimal places and $\theta$ is given in radians 
to 15 decimal places. 
For $x = 1\,(1)\,105$ and $y = 1\,(1)\,x$, $r$ is tabulated to 13 decimal places and $\theta$ is given in degrees to 
13 decimal places. 
Auxiliary tables are the natural logarithms of the integers 1 (1) 160 to 15 decimal places, and integral 
multiples of $\log_e 10$ for 1 (1) 20. Tables to assist in interpolation are also appended. 
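
The conversion the table performs can be sketched with Python's standard library, as below. The function name is hypothetical, and double-precision arithmetic carries only about 15 to 16 significant figures, so this shows the quantities tabulated but does not replace a 13 or 15 decimal table.

```python
import math


def rectangular_to_polar(x, y):
    """Return r, theta in degrees, log_e r and theta in radians for the point (x, y)."""
    r = math.hypot(x, y)
    theta_radians = math.atan2(y, x)
    return {
        "r": r,
        "theta_degrees": math.degrees(theta_radians),
        "log_r": math.log(r),
        "theta_radians": theta_radians,
    }


# Example with integer arguments of the kind used in the table.
entry = rectangular_to_polar(3, 4)
```

Using `atan2` rather than the arctangent of y/x keeps the angle in the correct quadrant and avoids a division when x is zero.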


Corrigenda 


In the Review section of the last issue of Biometrika (1959), 46, 494, the price of 
Wahrscheinlichkeitsrechnung und Mathematische Statistik, by M. Fisz (Berlin: VEB Deutscher 
Verlag der Wissenschaften) was quoted wrongly as DM. 14; the correct price is DM. 36. 
Sampling Inspection Tables: Single and Double Sampling. Second edition. By H. F. Dodge 
and H. G. Romig. New York: John Wiley and Sons Inc.; London: Chapman and Hall Ltd. 1959. 
Pp. 224. 64s. 


Modern Statistical Methods: Descriptive and Inductive. By P. O. Johnson and R. W. B. 
Jackson. Chicago: Rand McNally and Company. 1959. Pp. 514. $8.00. 


Introduction to Probability and Statistics. By G. W. McElrath and B. W. Lindgren. New 
York and London: The Macmillan Company, New York. 1959. Pp. 277. 44s. 


Individual Choice Behaviour. By R. Duncan Luce. New York: John Wiley and Sons Inc.; 
London: Chapman and Hall Ltd. 1959. Pp. 153. 48s. 


Dynamic Management Decision Games. By J. GREENE and R. Sisson. New York: John Wiley 